Inspiring Quotations

A good many times I have been present at gatherings of people who, by the standards 
of traditional culture, are thought highly educated and who have with considerable gusto 
been expressing their incredulity at the illiteracy of scientists. Once or twice I have been 
provoked and have asked the company how many of them could describe the Second Law of 
Thermodynamics. The response was cold: it was also negative. Yet I was asking something 
which is about the scientific equivalent of: Have you read a work of Shakespeare's? 

-C. P. Snow, The Two Cultures and the Scientific Revolution 

. . . C. P. Snow relates that he occasionally became so provoked at literary colleagues who 
scorned the restricted reading habits of scientists that he would challenge them to explain 
the second law of thermodynamics. The response was invariably a cold negative silence. The 
test was too hard. Even a scientist would be hard-pressed to explain Carnot engines and 
refrigerators, reversibility and irreversibility, energy dissipation and entropy increase. . . all in 
the span of a cocktail party conversation. 

-E. E. Daub, "Maxwell's demon" 

He began then, bewilderingly, to talk about something called entropy . . . She did gather 
that there were two distinct kinds of this entropy. One having to do with heat engines, the 
other with communication. . . "Entropy is a figure of speech then" ... "a metaphor" . 

-T. Pynchon, The Crying of Lot 49

INTRODUCTION



A. Overview 

This course surveys various uses of "entropy" concepts in the study of PDE, both linear
and nonlinear. We will begin in Chapters I-III with a recounting of entropy in physics, with
particular emphasis on axiomatic approaches to entropy as

(i) characterizing equilibrium states (Chapter I), 

(ii) characterizing irreversibility for processes (Chapter II), 

and 

(iii) characterizing continuum thermodynamics (Chapter III).

Later we will discuss probabilistic theories for entropy as 

(iv) characterizing uncertainty (Chapter VII). 

I will, especially in Chapters II and III, follow the mathematical derivation of entropy
provided by modern rational thermodynamics, thereby avoiding many customary physical
arguments. The main references here will be Callen [C], Owen [O], and Coleman-Noll [C-N].
In Chapter IV I follow Day [D] by demonstrating for certain linear second-order elliptic and
parabolic PDE that various estimates are analogues of entropy concepts (e.g. the Clausius
inequality). I also draw connections with Harnack inequalities. In Chapter V (conservation
laws) and Chapter VI (Hamilton-Jacobi equations) I review the proper notions of weak
solutions, illustrating that the inequalities inherent in the definitions can be interpreted as
irreversibility conditions. Chapter VII introduces the probabilistic interpretation of entropy
and Chapter VIII concerns the related theory of large deviations. Following Varadhan [V]
and Rezakhanlou [R], I will explain some connections with entropy, and demonstrate various
PDE applications.

B. Themes 

In spite of the longish time spent in Chapters I-III and VII reviewing physics, this is a
mathematics course on partial differential equations. My main concern is PDE and how 
various notions involving entropy have influenced our understanding of PDE. As we will 
cover a lot of material from many sources, let me explicitly write out here some unifying 
themes: 

(i) the use of entropy in deriving various physical PDE, 

(ii) the use of entropy to characterize irreversibility in PDE evolving in time, 

and

(iii) the use of entropy in providing variational principles.

Another ongoing issue will be

(iv) understanding the relationships between entropy and convexity. 

I am as usual very grateful to F. Yeager for her quick and accurate typing of these notes.

CHAPTER 1: Entropy and equilibrium



A. Thermal systems in equilibrium 

We start, following Callen [C] and Wightman [W], by introducing a simple mathematical 
structure, which we will later interpret as modeling equilibria of thermal systems: 

Notation. We denote by $(X_0, X_1, \dots, X_m)$ a typical point of $\mathbb{R}^{m+1}$, and hereafter write

$E = X_0$.

□

A model for a thermal system in equilibrium 

Let us suppose we are given: 

(a) an open, convex subset $\Sigma$ of $\mathbb{R}^{m+1}$,

and

(b) a $C^1$-function

(1) $S : \Sigma \to \mathbb{R}$

such that

(2) (i) $S$ is concave,
    (ii) $\frac{\partial S}{\partial E} > 0$,
    (iii) $S$ is positively homogeneous of degree 1.

We call $\Sigma$ the state space and $S$ the entropy of our system:

(3) $S = S(E, X_1, \dots, X_m)$.

Here and afterwards we assume without further comment that S and other functions derived 
from S are evaluated only in open, convex regions where the various functions make sense. 
In particular, when we note that (2)(iii) means

(4) $S(\lambda E, \lambda X_1, \dots, \lambda X_m) = \lambda S(E, X_1, \dots, X_m)$  $(\lambda > 0)$,

we automatically consider in (4) only those states for which both sides of (4) are defined.
Owing to (2)(ii), we can solve (3) for $E$ as a $C^1$ function of $(S, X_1, \dots, X_m)$:

(5) $E = E(S, X_1, \dots, X_m)$.

We call the function $E$ the internal energy.

Definitions.

(6) $T = \frac{\partial E}{\partial S}$ = temperature,
    $P_k = -\frac{\partial E}{\partial X_k}$ = $k$-th generalized force (or pressure).

Lemma 1 (i) The function $E$ is positively homogeneous of degree 1:

(7) $E(\lambda S, \lambda X_1, \dots, \lambda X_m) = \lambda E(S, X_1, \dots, X_m)$  $(\lambda > 0)$.

(ii) The functions $T, P_k$ $(k = 1, \dots, m)$ are positively homogeneous of degree 0:

(8) $T(\lambda S, \lambda X_1, \dots, \lambda X_m) = T(S, X_1, \dots, X_m)$,
    $P_k(\lambda S, \lambda X_1, \dots, \lambda X_m) = P_k(S, X_1, \dots, X_m)$  $(\lambda > 0)$.

We will later interpret (2), (7) physically as saying that $S$, $E$ are extensive parameters, and we
say also that $X_1, \dots, X_m$ are extensive. By contrast (8) says $T, P_k$ are intensive parameters.

Proof. 1. We have $W = E(S(W, X_1, \dots, X_m), X_1, \dots, X_m)$ for all $W, X_1, \dots, X_m$. Thus

$\lambda W = E(S(\lambda W, \lambda X_1, \dots, \lambda X_m), \lambda X_1, \dots, \lambda X_m)$
$= E(\lambda S(W, X_1, \dots, X_m), \lambda X_1, \dots, \lambda X_m)$ by (4).

Write $S = S(W, X_1, \dots, X_m)$, $W = E(S, X_1, \dots, X_m)$ to derive (7).

2. Since $S$ is $C^1$, so is $E$. Differentiate (7) with respect to $S$, to deduce

$\lambda \frac{\partial E}{\partial S}(\lambda S, \lambda X_1, \dots, \lambda X_m) = \lambda \frac{\partial E}{\partial S}(S, X_1, \dots, X_m)$.

The first equality in (8) follows from the definition $T = \frac{\partial E}{\partial S}$. The other equalities in (8) are
similar. □

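Lemma 2 We have

(9) $\frac{\partial S}{\partial E} = \frac{1}{T}$, $\frac{\partial S}{\partial X_k} = \frac{P_k}{T}$  $(k = 1, \dots, m)$.

Proof. $T = \frac{\partial E}{\partial S} = \left(\frac{\partial S}{\partial E}\right)^{-1}$. Also

$W = E(S(W, X_1, \dots, X_m), X_1, \dots, X_m)$

for all $W, X_1, \dots, X_m$. Differentiate with respect to $X_k$:

$0 = \frac{\partial E}{\partial S}\frac{\partial S}{\partial X_k} + \frac{\partial E}{\partial X_k}$, where $\frac{\partial E}{\partial S} = T$ and $\frac{\partial E}{\partial X_k} = -P_k$.

□

We record the definitions (6) by writing

(10) $dE = T\,dS - \sum_{k=1}^m P_k\,dX_k$   Gibbs' formula.

Note carefully: at this point (10) means merely $T = \frac{\partial E}{\partial S}$, $P_k = -\frac{\partial E}{\partial X_k}$ $(k = 1, \dots, m)$. We will
later in Chapter II interpret $T\,dS$ as "infinitesimal heating" and $\sum_{k=1}^m P_k\,dX_k$ as "infinitesimal
working" for a process. In this chapter however there is no notion whatsoever of anything
changing in time: everything is in equilibrium.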

Terminology. The formula 

$S = S(E, X_1, \dots, X_m)$

is called the fundamental equation of our system, and by definition contains all the
thermodynamic information. An identity involving other derived quantities (i.e. $T$, $P_k$ $(k = 1, \dots, m)$)
is an equation of state, which typically does not contain all the thermodynamic information.

□ 



B. Examples 

In applications $X_1, \dots, X_m$ may measure many different physical quantities.

1. Simple fluid. An important case is a homogeneous simple fluid, for which

(1)  $E$ = internal energy
     $V$ = volume
     $N$ = mole number
     $S = S(E, V, N)$ = entropy
     $T = \frac{\partial E}{\partial S}$ = temperature
     $P = -\frac{\partial E}{\partial V}$ = pressure
     $\mu = -\frac{\partial E}{\partial N}$ = chemical potential.

So here we take $X_1 = V$, $X_2 = N$, where $N$ measures the amount of the substance
comprising the fluid. Gibbs' formula reads:

(2) $dE = T\,dS - P\,dV - \mu\,dN$.



Remark. We will most often consider the situation that $N$ is identically constant, say
$N = 1$. Then we write

(3) $S(E, V) = S(E, V, 1)$ = entropy/mole,

and so

(4)  $E$ = internal energy
     $V$ = volume
     $S = S(E, V)$ = entropy
     $T = \frac{\partial E}{\partial S}$ = temperature
     $P = -\frac{\partial E}{\partial V}$ = pressure

with

(5) $dE = T\,dS - P\,dV$.

Note that $S(E, V)$ will not satisfy the homogeneity condition (2)(iii) however. □

Remark. If we have instead a multicomponent simple fluid, which is a uniform mixture of
$r$ different substances with mole numbers $N_1, \dots, N_r$, we write

$S = S(E, V, N_1, \dots, N_r)$

$\mu_j = -\frac{\partial E}{\partial N_j}$ = chemical potential of the $j$-th component.

□

2. Other examples. Although we will for simplicity of exposition mostly discuss simple 
fluid systems, it is important to understand that many interpretations are possible. (See, 
e.g., Zemansky [Z].) 

Extensive parameter X        Intensive parameter P = -∂E/∂X

length                       tension
area                         surface tension
volume                       pressure
electric charge              electric force
magnetization                magnetic intensity

Remark. Again to foreshadow, we are able in all these situations to interpret: 

$P\,dX$ = "infinitesimal work" performed by the system during some process,

where $P$ is the "generalized force" and $dX$ is the "infinitesimal displacement".

□

C. Physical interpretations of the model

In this section we provide some nonrigorous physical arguments supporting our model in
§A of a thermal system in equilibrium. We wish therefore to explain why we suppose

(i) $S$ is concave,
(ii) $\frac{\partial S}{\partial E} > 0$,
(iii) $S$ is positively homogeneous of degree 1.

(See Appendix B for statements of "physical postulates".)

1. Equilibrium 

First of all we are positing that the "thermal system in equilibrium" can be completely
described by specifying the $(m + 1)$ macroscopic parameters $X_0, X_1, \dots, X_m$, of which
$E = X_0$, the internal energy, plays a special role. Thus we imagine, for instance, a body of fluid,
for which there is no temporal or spatial dependence for $E, X_1, \dots, X_m$.

2. Positivity of temperature

Since $\frac{\partial S}{\partial E} = \frac{1}{T}$, hypothesis (ii) is simply that the temperature is always positive.

3. Extensive and intensive parameters 

The homogeneity condition (iii) is motivated as follows. Consider for instance a fluid 
body in equilibrium for which the energy is E, the entropy is S, and the other extensive 
parameters are $X_k$ $(k = 1, \dots, m)$.

Next consider a subregion #1, which comprises a $\lambda$-th fraction of the entire region
$(0 < \lambda < 1)$. Let $S^1, E^1, \dots, X_k^1$ be the extensive parameters for the subregion. Then

(1)  $S^1 = \lambda S$
     $E^1 = \lambda E$
     $X_k^1 = \lambda X_k$  $(k = 1, \dots, m)$.

Consider as well the complementary subregion #2, for which

$S^2 = (1 - \lambda) S$
$E^2 = (1 - \lambda) E$
$X_k^2 = (1 - \lambda) X_k$  $(k = 1, \dots, m)$.
Thus

(2)  $S = S^1 + S^2$
     $E = E^1 + E^2$
     $X_k = X_k^1 + X_k^2$  $(k = 1, \dots, m)$.

The homogeneity assumption (iii) is just (1). As a consequence, we see from (2) that
$S, E, \dots, X_m$ are additive over subregions of our thermal system in equilibrium.

On the other hand, if $T^1, P_k^1, \dots$ are the temperatures and generalized forces for subregion
#1, and $T^2, \dots, P_k^2, \dots$ are the same for subregion #2, we have

$T = T^1 = T^2$
$P_k = P_k^1 = P_k^2$  $(k = 1, \dots, m)$,

owing to Lemma 1 in §A. Hence $T, \dots, P_k$ are intensive parameters, which take the same
value on each subregion of our thermal system in equilibrium.

4. Concavity of S 

Note very carefully that we are hypothesizing the additivity condition (2) only for
subregions of a given thermal system in equilibrium.

We next motivate the concavity hypothesis (i) by looking at the quite different physical
situation that we have two isolated fluid bodies A, B of the same substance:

[Figure: two isolated fluid bodies A and B.]

Here

$S^A = S(E^A, \dots, X_k^A, \dots)$ = entropy of A,
$S^B = S(E^B, \dots, X_k^B, \dots)$ = entropy of B,

for the same function $S(\cdot, \cdots)$. The total entropy is

$S^A + S^B$.

We now ask what happens when we "combine" A and B into a new system C, in such a way
that no work is done and no heat is transferred to or from the surrounding environment:

[Figure: the combined system C, with parameters $S^C, E^C, \dots, X_k^C, \dots$.]

(In Chapter II we will more carefully define "heat" and "work".) After C reaches equilibrium,
we can meaningfully discuss $S^C, E^C, \dots, X_k^C, \dots$. Since no work has been done, we have

$X_k^C = X_k^A + X_k^B$  $(k = 1, \dots, m)$

and since, in addition, there has been no heat loss or gain,

$E^C = E^A + E^B$.

This is a form of the First Law of thermodynamics.

We however do not write a similar equality for the entropy $S$. Rather we invoke the
Second Law of thermodynamics, which implies that entropy cannot decrease during any
irreversible process. Thus

(3) $S^C \ge S^A + S^B$.

But then

(4)  $S(E^A + E^B, \dots, X_k^A + X_k^B, \dots) = S(E^C, \dots, X_k^C, \dots) = S^C$
     $\ge S^A + S^B$
     $= S(E^A, \dots, X_k^A, \dots) + S(E^B, \dots, X_k^B, \dots)$.

This inequality implies $S$ is a concave function of $(E, X_1, \dots, X_m)$. Indeed, if $0 < \lambda < 1$, we
have:

$S(\lambda E^A + (1 - \lambda) E^B, \dots, \lambda X_k^A + (1 - \lambda) X_k^B, \dots)$
$\ge S(\lambda E^A, \dots, \lambda X_k^A, \dots) + S((1 - \lambda) E^B, \dots, (1 - \lambda) X_k^B, \dots)$ by (4)
$= \lambda S(E^A, \dots, X_k^A, \dots) + (1 - \lambda) S(E^B, \dots, X_k^B, \dots)$ by (iii).

Thus $S$ is concave.



5. Convexity of E 

Next we show that

(5) $E$ is a convex function of $(S, X_1, \dots, X_m)$.

To verify (5), take any $S^A, S^B, X_1^A, \dots, X_m^A, X_1^B, \dots, X_m^B$, and $0 < \lambda < 1$. Define

$E^A := E(S^A, X_1^A, \dots, X_m^A)$
$E^B := E(S^B, X_1^B, \dots, X_m^B)$;

so that

$S^A = S(E^A, X_1^A, \dots, X_m^A)$
$S^B = S(E^B, X_1^B, \dots, X_m^B)$.

Since $S$ is concave,

(6)  $S(\lambda E^A + (1 - \lambda) E^B, \dots, \lambda X_k^A + (1 - \lambda) X_k^B, \dots)$
     $\ge \lambda S(E^A, \dots, X_k^A, \dots) + (1 - \lambda) S(E^B, \dots, X_k^B, \dots)$.

Now

$W = E(S(W, \dots, X_k, \dots), \dots, X_k, \dots)$

for all $W, X_1, \dots, X_m$. Hence

$\lambda E^A + (1 - \lambda) E^B = E(S(\lambda E^A + (1 - \lambda) E^B, \dots, \lambda X_k^A + (1 - \lambda) X_k^B, \dots), \dots, \lambda X_k^A + (1 - \lambda) X_k^B, \dots)$
$\ge E(\lambda S(E^A, \dots, X_k^A, \dots) + (1 - \lambda) S(E^B, \dots, X_k^B, \dots), \dots, \lambda X_k^A + (1 - \lambda) X_k^B, \dots)$

owing to (6), since $\frac{\partial E}{\partial S} = T > 0$. Rewriting, we deduce

$\lambda E(S^A, \dots, X_k^A, \dots) + (1 - \lambda) E(S^B, \dots, X_k^B, \dots)$
$\ge E(\lambda S^A + (1 - \lambda) S^B, \dots, \lambda X_k^A + (1 - \lambda) X_k^B, \dots)$,

and so $E$ is convex. □

6. Entropy maximization and energy minimization

Lastly we mention some physical variational principles (taken from Callen [C, p. 131-137])
for isolated thermal systems.

Entropy Maximization Principle. The equilibrium value of any unconstrained internal 
parameter is such as to maximize the entropy for the given value of the total internal energy. 

Energy Minimization Principle. The equilibrium value of any unconstrained internal 
parameter is such as to minimize the energy for the given value of the total entropy. 



[Figures: the graph of $S = S(E, \dots, X_k, \dots)$, illustrating the two principles.]

The first picture illustrates the entropy maximization principle: Given the energy constraint
$E = E^*$, the values of the unconstrained parameters $(X_1, \dots, X_m)$ are such as to maximize

$(X_1, \dots, X_m) \mapsto S(E^*, X_1, \dots, X_m)$.

The second picture is the "dual" energy minimization principle. Given the entropy constraint
$S = S^*$, the values of the unconstrained parameters $(X_1, \dots, X_m)$ are such as to minimize

$(X_1, \dots, X_m) \mapsto E(S^*, X_1, \dots, X_m)$.



D. Thermodynamic potentials 

Since $E$ is convex and $S$ is concave, we can employ ideas from convex analysis to rewrite
various formulas in terms of the intensive variables $T = \frac{\partial E}{\partial S}$, $P_k = -\frac{\partial E}{\partial X_k}$ $(k = 1, \dots, m)$. The
primary tool will be the Legendre transform. (See e.g. Sewell [SE], [E1, §III.C], etc.)

1. Review of Legendre transform 

Assume that $H : \mathbb{R}^n \to (-\infty, +\infty]$ is a convex, lower semicontinuous function, which is
proper (i.e. not identically equal to infinity).

Definition. The Legendre transform of $H$ is

(1) $L(q) = \sup_{p \in \mathbb{R}^n} (p \cdot q - H(p))$  $(q \in \mathbb{R}^n)$.

We usually write $L = H^*$. It is not very hard to prove that $L$ is likewise convex, lower
semicontinuous and proper. Furthermore the Legendre transform of $L = H^*$ is $H$:

(2) $L = H^*$, $H = L^*$.

We say $H$ and $L$ are dual convex functions.

Now suppose for the moment that $H$ is $C^2$ and is strictly convex (i.e. $D^2 H > 0$). Then,
given $q$, there exists a unique point $p$ which maximizes the right hand side of (1), namely
the unique point $p = p(q)$ for which

(3) $q = DH(p)$.

Then

(4) $L(q) = p \cdot q - H(p)$, $p = p(q)$ solving (3).

Furthermore

$DL(q) = p + (q - DH(p)) D_q p = p$ by (3),

and so

(5) $p = DL(q)$.

Remark. In mechanics, $H$ often denotes the Hamiltonian and $L$ the Lagrangian. □
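The duality formulas (1), (3), (5) can be checked numerically in one dimension. The sketch below is only an illustration under assumed choices that do not come from the notes: the test function $H(p) = p^2/2 + p^4/4$ (which is $C^2$ with $D^2 H > 0$), the search window $[-5, 5]$ and the grid size are all hypothetical. It approximates the supremum in (1) by a maximum over a grid.

```python
def H(p):
    # Assumed test function (not from the notes): C^2 with H''(p) = 1 + 3 p^2 > 0.
    return 0.5 * p**2 + 0.25 * p**4


def legendre(H, q, p_min=-5.0, p_max=5.0, n=10001):
    """Approximate L(q) = sup_p (p*q - H(p)) by a maximum over a uniform grid."""
    step = (p_max - p_min) / (n - 1)
    return max(p * q - H(p) for p in (p_min + i * step for i in range(n)))


# For q = 10 the maximizer solves q = DH(p) = p + p^3, i.e. p = 2, so by (4)
# L(10) = 2*10 - H(2) = 20 - 6 = 14.
L_at_10 = legendre(H, 10.0)

# Formula (5): DL(q) = p(q). Approximate DL(10) by a central difference.
h = 0.1
DL_at_10 = (legendre(H, 10.0 + h) - legendre(H, 10.0 - h)) / (2 * h)
```

With these assumed values, `L_at_10` should be close to 14 and `DL_at_10` close to the maximizer $p = 2$, consistent with (4) and (5). The grid maximum is a crude substitute for the supremum and is adequate only because the maximizer lies well inside the window.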



2. Definitions 

The energy $E$ and entropy $S$ are not directly physically measurable, whereas certain of
the intensive variables (e.g. $T$, $P$) are. It is consequently convenient to employ the Legendre
transform to convert to functions of various intensive variables. Let us consider an energy
function

$E = E(S, V, X_2, \dots, X_m)$,

where we explicitly take $X_1 = V$ = volume and regard the remaining parameters $X_2, \dots, X_m$
as being fixed. For simplicity of notation, we do not display $(X_2, \dots, X_m)$, and just write

(6) $E = E(S, V)$.

There are 3 possible Legendre transforms, according as to whether we transform in the
variable $S$ only, in $V$ only, or in $(S, V)$ together. Because of sign conventions (i.e. $T = \frac{\partial E}{\partial S}$,
$P = -\frac{\partial E}{\partial V}$) and because it is customary in thermodynamics to take the negative of the
mathematical Legendre transform, the relevant formulas are actually these:

Definitions. (i) The Helmholtz free energy $F$ is

(7) $F(T, V) = \inf_S (E(S, V) - TS)$.

(The symbol $A$ is also used to denote the Helmholtz free energy.)

(ii) The enthalpy $H$ is

(8) $H(S, P) = \inf_V (E(S, V) + PV)$.

(iii) The Gibbs potential (a.k.a. free enthalpy) is

(9) $G(T, P) = \inf_{S, V} (E(S, V) + PV - ST)$.

The functions $E, F, G, H$ are called thermodynamic potentials.

Remark. The "inf" in (7) is taken over those $S$ such that $(S, V)$ lies in the domain of $E$.
A similar remark applies to (8), (9). □

To go further we henceforth assume: 
(10) $E$ is $C^2$, strictly convex

and furthermore that for the range of values we consider

(11) the "inf" in each of (7), (8), (9) is attained at a unique point in the domain of $E$.

We can then recast the definitions (7)-(9):

Thermodynamic potentials, rewritten:

(12) $F = E - TS$, where $T = \frac{\partial E}{\partial S}$

(13) $H = E + PV$, where $P = -\frac{\partial E}{\partial V}$

(14) $G = E - TS + PV$, where $T = \frac{\partial E}{\partial S}$, $P = -\frac{\partial E}{\partial V}$.

More precisely, (12) says $F(T, V) = E(S, V) - TS$, where $S = S(T, V)$ solves $T = \frac{\partial E}{\partial S}(S, V)$.
We are assuming we can uniquely and smoothly solve for $S = S(T, V)$.

Commentary. If E is not strictly convex, we cannot in general rewrite (7)-(9) as (12)-(14). 
In this case, for example when the graph of E contains a line or plane, the geometry has the 
physical interpretation of phase transitions: see Wightman [W]. □ 

Lemma 3 

(i) E is locally strictly convex in (S, V) . 

(ii) F is locally strictly concave in T , locally strictly convex in V . 

(iii) H is locally strictly concave in P , locally strictly convex in S . 

(iv) G is locally strictly concave in (T, P) . 

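Remark. From (9) we see that $G$ is the inf of affine mappings of $(T, P)$ and thus is
concave. However to establish the strict concavity, etc., we will invoke (10), (11) and use the
formulations (12)-(14). Note also that we say "locally strictly" convex, concave in (ii)-(iv),
since what we really establish is the sign of various second derivatives. □

Proof. 1. First of all, (i) is just our assumption (10).

2. To prove (ii), we recall (12) and write

(15) $F(T, V) = E(S(T, V), V) - T S(T, V)$,

where

(16) $T = \frac{\partial E}{\partial S}(S(T, V), V)$.

Then (15) implies

$\frac{\partial F}{\partial T} = \frac{\partial E}{\partial S}\frac{\partial S}{\partial T} - S - T\frac{\partial S}{\partial T} = -S$

$\frac{\partial F}{\partial V} = \frac{\partial E}{\partial S}\frac{\partial S}{\partial V} + \frac{\partial E}{\partial V} - T\frac{\partial S}{\partial V} = \frac{\partial E}{\partial V}$.

Thus

(17) $\frac{\partial^2 F}{\partial T^2} = -\frac{\partial S}{\partial T}$, $\frac{\partial^2 F}{\partial V^2} = \frac{\partial^2 E}{\partial V \partial S}\frac{\partial S}{\partial V} + \frac{\partial^2 E}{\partial V^2}$.

Next differentiate (16):

$1 = \frac{\partial^2 E}{\partial S^2}\frac{\partial S}{\partial T}$

$0 = \frac{\partial^2 E}{\partial S^2}\frac{\partial S}{\partial V} + \frac{\partial^2 E}{\partial S \partial V}$.

Thus (17) gives:

$\frac{\partial^2 F}{\partial T^2} = -\left(\frac{\partial^2 E}{\partial S^2}\right)^{-1}$

$\frac{\partial^2 F}{\partial V^2} = \frac{\partial^2 E}{\partial V^2} - \left(\frac{\partial^2 E}{\partial S \partial V}\right)^2 \left(\frac{\partial^2 E}{\partial S^2}\right)^{-1}$.

Since $E$ is strictly convex:

$\frac{\partial^2 E}{\partial S^2} > 0$, $\frac{\partial^2 E}{\partial V^2} > 0$, $\frac{\partial^2 E}{\partial S^2}\frac{\partial^2 E}{\partial V^2} > \left(\frac{\partial^2 E}{\partial S \partial V}\right)^2$.

Hence:

$\frac{\partial^2 F}{\partial T^2} < 0$, $\frac{\partial^2 F}{\partial V^2} > 0$.

This proves (ii), and (iii), (iv) are similar. □

3. Maxwell's relations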



Notation. We will hereafter regard $T$, $P$ in some instances as independent variables (and
not, as earlier, as functions of $S$, $V$). We will accordingly need better notation when we
compute partial derivatives, to display which independent variables are involved. The standard
notation is to list the other independent variables outside parentheses.

For instance if we think of $S$ as being a function of, say, $T$ and $V$, we henceforth write

$\left(\frac{\partial S}{\partial T}\right)_V$

to denote the partial derivative of $S$ in $T$, with $V$ held constant, and

$\left(\frac{\partial S}{\partial V}\right)_T$

to denote the partial derivative of $S$ in $V$, $T$ constant. However we will not employ
parentheses when computing the partial derivatives of $E, F, G, H$ with respect to their "natural"
arguments. Thus if we are as usual thinking of $F$ as a function of $T$, $V$, we write
$\frac{\partial F}{\partial T}$, not $\left(\frac{\partial F}{\partial T}\right)_V$.

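We next compute the first derivatives of the thermodynamic potentials:

Energy. $E = E(S, V)$

(18) $\frac{\partial E}{\partial S} = T$, $\frac{\partial E}{\partial V} = -P$.

Free energy. $F = F(T, V)$

(19) $\frac{\partial F}{\partial T} = -S$, $\frac{\partial F}{\partial V} = -P$.

Enthalpy. $H = H(S, P)$

(20) $\frac{\partial H}{\partial S} = T$, $\frac{\partial H}{\partial P} = V$.

Gibbs potential. $G = G(T, P)$

(21) $\frac{\partial G}{\partial T} = -S$, $\frac{\partial G}{\partial P} = V$.

Proof. The formulas (18) simply record our definitions of $T$, $P$. The remaining identities
are variants of the duality (3), (5). For instance, $F = E - TS$, where $T = \frac{\partial E}{\partial S}$, $S = S(T, V)$.
So

$\frac{\partial F}{\partial T} = \frac{\partial E}{\partial S}\left(\frac{\partial S}{\partial T}\right)_V - S - T\left(\frac{\partial S}{\partial T}\right)_V = -S$,

as already noted earlier. □

We can now equate the mixed second partial derivatives of $E, F, G, H$ to derive further
identities. These are Maxwell's relations:

(22) $\left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial P}{\partial S}\right)_V$

(23) $\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V$

(24) $\left(\frac{\partial T}{\partial P}\right)_S = \left(\frac{\partial V}{\partial S}\right)_P$

(25) $\left(\frac{\partial S}{\partial P}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_P$.

The equality (22) just says $\frac{\partial^2 E}{\partial V \partial S} = \frac{\partial^2 E}{\partial S \partial V}$; (23) says $\frac{\partial^2 F}{\partial V \partial T} = \frac{\partial^2 F}{\partial T \partial V}$, etc.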
E. Capacities 

For later reference, we record here some notation: 

(1) $C_P = T\left(\frac{\partial S}{\partial T}\right)_P$ = heat capacity at constant pressure

(2) $C_V = T\left(\frac{\partial S}{\partial T}\right)_V$ = heat capacity at constant volume

(3) $\Lambda_P = T\left(\frac{\partial S}{\partial P}\right)_T$ = latent heat with respect to pressure

(4) $\Lambda_V = T\left(\frac{\partial S}{\partial V}\right)_T$ = latent heat with respect to volume

(5) $\beta = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_P$ = coefficient of thermal expansion

(6) $K_T = -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_T$ = isothermal compressibility

(7) $K_S = -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_S$ = adiabatic compressibility.

(See [B-S, p. 786-787] for the origin of the terms "latent heat", "heat capacity".)
There are many relationships among these quantities: 

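Lemma 4

(i) $C_V = \left(\frac{\partial E}{\partial T}\right)_V$

(ii) $C_P = \left(\frac{\partial H}{\partial T}\right)_P$

(iii) $C_P \ge C_V > 0$

(iv) $\Lambda_V - P = \left(\frac{\partial E}{\partial V}\right)_T$.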

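Proof. 1. Think of $E$ as a function of $T, V$; that is, $E = E(S(T, V), V)$, where $S(T, V)$
means $S$ as a function of $T, V$. Then

$\left(\frac{\partial E}{\partial T}\right)_V = \frac{\partial E}{\partial S}\left(\frac{\partial S}{\partial T}\right)_V = T\left(\frac{\partial S}{\partial T}\right)_V = C_V$.

Likewise, think of $H = H(S(T, P), P)$. Then

$\left(\frac{\partial H}{\partial T}\right)_P = \frac{\partial H}{\partial S}\left(\frac{\partial S}{\partial T}\right)_P = T\left(\frac{\partial S}{\partial T}\right)_P = C_P$,

where we used (20) from §D.

2. According to (19) in §D:

$S = -\frac{\partial F}{\partial T}$.

Thus

(8) $C_V = T\left(\frac{\partial S}{\partial T}\right)_V = -T\frac{\partial^2 F}{\partial T^2} > 0$,

since $T \mapsto F(T, V)$ is locally strictly concave. Likewise

$S = -\frac{\partial G}{\partial T}$

owing to (21) in §D; whence

(9) $C_P = T\left(\frac{\partial S}{\partial T}\right)_P = -T\frac{\partial^2 G}{\partial T^2} > 0$.

3. Now according to (12), (14) in §D:

$G = F + PV$;

that is,

$G(T, P) = F(T, V) + PV$, where $V = V(T, P)$ solves $\frac{\partial F}{\partial V}(T, V) = -P$.

Consequently:

$\frac{\partial G}{\partial T} = \frac{\partial F}{\partial T} + \left(\frac{\partial F}{\partial V} + P\right)\left(\frac{\partial V}{\partial T}\right)_P = \frac{\partial F}{\partial T}$,

and so

(10) $\frac{\partial^2 G}{\partial T^2} = \frac{\partial^2 F}{\partial T^2} + \frac{\partial^2 F}{\partial T \partial V}\left(\frac{\partial V}{\partial T}\right)_P$.

But differentiating the identity $\frac{\partial F}{\partial V}(T, V) = -P$, we deduce

$\frac{\partial^2 F}{\partial V \partial T} + \frac{\partial^2 F}{\partial V^2}\left(\frac{\partial V}{\partial T}\right)_P = 0$.

Substituting into (10) and recalling (8), (9), we conclude

(11) $C_P - C_V = T\left(\frac{\partial^2 F}{\partial T^2} - \frac{\partial^2 G}{\partial T^2}\right) = \frac{T}{\partial^2 F / \partial V^2}\left(\frac{\partial^2 F}{\partial V \partial T}\right)^2 \ge 0$,

since $V \mapsto F(T, V)$ is strictly convex.

This proves (iii). Assertion (iv) is left as an easy exercise. □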

Remark. Using (19) in §D, we can write

(12) $C_P - C_V = -T\left(\frac{\partial P}{\partial T}\right)_V^2 \Big/ \left(\frac{\partial P}{\partial V}\right)_T$   (Kelvin's formula).

□

F. More examples

1. Ideal gas

An ideal gas is a simple fluid with the equation of state

(1) $PV = RT$,

where $R$ is the gas constant (Appendix A) and we have normalized by taking $N = 1$ mole.
As noted in §A, such an expression does not embody the full range of thermodynamic
information available from the fundamental equation $S = S(E, V)$. We will see however that
many conclusions can be had from (1) alone:

Theorem 1 For an ideal gas,

(i) $C_P$, $C_V$ are functions of $T$ only:

$C_P = C_P(T)$, $C_V = C_V(T)$.

(ii) $C_P - C_V = R$.

(iii) $E$ is a function of $T$ only:

(2) $E = E(T) = \int_{T_0}^T C_V(\theta)\,d\theta + E_0$.

(iv) $S$ as a function of $(T, V)$ is:

(3) $S = S(T, V) = R \log V + \int_{T_0}^T \frac{C_V(\theta)}{\theta}\,d\theta + S_0$.

Formulas (2), (3) characterize $E$, $S$ up to additive constants.
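Proof. 1. Since $E = E(S, V) = E(S(T, V), V)$, we have

$\left(\frac{\partial E}{\partial V}\right)_T = \frac{\partial E}{\partial S}\left(\frac{\partial S}{\partial V}\right)_T + \frac{\partial E}{\partial V}$
$= T\left(\frac{\partial S}{\partial V}\right)_T - P$
$= T\left(\frac{\partial P}{\partial T}\right)_V - P$,

where we utilized the Maxwell relation (23) in §D. But for an ideal gas,
$T\left(\frac{\partial P}{\partial T}\right)_V - P = \frac{TR}{V} - P = 0$. Consequently $\left(\frac{\partial E}{\partial V}\right)_T = 0$. Hence if we regard $E$ as a function of $(T, V)$, $E$ in
fact depends only on $T$. But then, owing to Lemma 4,

$C_V = \left(\frac{\partial E}{\partial T}\right)_V = \frac{dE}{dT}$

depends only on $T$.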

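2. Next, we recall Kelvin's formula (12) in §E:

$C_P - C_V = -T\left(\frac{\partial P}{\partial T}\right)_V^2 \Big/ \left(\frac{\partial P}{\partial V}\right)_T$.

Since $PV = RT$, we have

$\left(\frac{\partial P}{\partial V}\right)_T = -\frac{RT}{V^2}$, $\left(\frac{\partial P}{\partial T}\right)_V = \frac{R}{V}$.

Thus

$C_P - C_V = T\,\frac{R^2 / V^2}{RT / V^2} = R$.

As $R$ is constant and $C_V$ depends only on $T$, $C_P$ likewise depends only on $T$.

3. Finally, think of $S$ as a function of $T, V$:

$S = S(E(T, V), V) = S(E(T), V)$.

Then

$\left(\frac{\partial S}{\partial T}\right)_V = \frac{\partial S}{\partial E}\frac{dE}{dT} = \frac{1}{T} C_V(T)$

$\left(\frac{\partial S}{\partial V}\right)_T = \frac{\partial S}{\partial V} = \frac{P}{T} = \frac{R}{V}$.

Formula (3) follows. □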
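Assertion (ii) can be spot-checked numerically from Kelvin's formula (12) in §E, using only the equation of state. The following sketch approximates the two partial derivatives of $P$ by central differences; the function names, the step size `h`, the numerical value of $R$ and the sample state $(T, V) = (300, 1)$ are illustrative assumptions, not part of the notes.

```python
R = 8.314  # assumed numerical value of the gas constant, in J/(mol K)


def ideal_gas_pressure(T, V):
    # Equation of state (1): P V = R T, with N = 1 mole.
    return R * T / V


def heat_capacity_gap(P, T, V, h=1e-4):
    """Evaluate Kelvin's formula C_P - C_V = -T (dP/dT)_V^2 / (dP/dV)_T
    for a pressure function P(T, V), using central differences."""
    dP_dT = (P(T + h, V) - P(T - h, V)) / (2 * h)
    dP_dV = (P(T, V + h) - P(T, V - h)) / (2 * h)
    return -T * dP_dT**2 / dP_dV


gap = heat_capacity_gap(ideal_gas_pressure, 300.0, 1.0)
```

Up to discretization error, `gap` should agree with `R` at any admissible state. Because `heat_capacity_gap` takes the pressure function as an argument, the same helper could be applied to another equation of state.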

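Remark. We can solve (2) for $T$ as a function of $E$ and so determine $S = S(T(E), V)$ as a
function of $(E, V)$. Let us check that

$(E, V) \mapsto S$ is concave,

provided

$C_V > 0$.

Now

$\frac{\partial S}{\partial E} = \left(\frac{\partial S}{\partial T}\right)_V \frac{\partial T}{\partial E} = \frac{1}{T} C_V(T) \frac{1}{C_V(T)} = \frac{1}{T}$

$\frac{\partial S}{\partial V} = \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V = \frac{R}{V}$.

Thus

$\frac{\partial^2 S}{\partial E^2} = -\frac{1}{T^2}\frac{\partial T}{\partial E} = -\frac{1}{T^2 C_V(T)} < 0$

$\frac{\partial^2 S}{\partial E \partial V} = 0$

$\frac{\partial^2 S}{\partial V^2} = -\frac{R}{V^2} < 0$,

and so $(E, V) \mapsto S$ is concave. □

Remark. Recalling §B, we can write for $N > 0$ moles of ideal gas:

(4) $PV = NRT$,

and

$S(E, V, N) = N S\left(\frac{E}{N}, \frac{V}{N}, 1\right) = N S\left(\frac{E}{N}, \frac{V}{N}\right)$,

where the $S$ on the left is the function of $(E, V, N)$ from §B and the $S$ on the far right is
the function of $(E, V)$ from §B.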

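We note next that

(5) $(E, V, N) \mapsto S$ is concave.

Indeed if $0 < \lambda < 1$, $N, \hat{N} > 0$, then:

$S(\lambda E + (1 - \lambda)\hat{E}, \lambda V + (1 - \lambda)\hat{V}, \lambda N + (1 - \lambda)\hat{N})$
$= (\lambda N + (1 - \lambda)\hat{N})\, S\left(\frac{\lambda E + (1 - \lambda)\hat{E}}{\lambda N + (1 - \lambda)\hat{N}}, \frac{\lambda V + (1 - \lambda)\hat{V}}{\lambda N + (1 - \lambda)\hat{N}}\right)$
$= (\lambda N + (1 - \lambda)\hat{N})\, S\left(\mu \frac{E}{N} + (1 - \mu)\frac{\hat{E}}{\hat{N}}, \mu \frac{V}{N} + (1 - \mu)\frac{\hat{V}}{\hat{N}}\right)$,

where

$\mu = \frac{\lambda N}{\lambda N + (1 - \lambda)\hat{N}}$, $1 - \mu = \frac{(1 - \lambda)\hat{N}}{\lambda N + (1 - \lambda)\hat{N}}$.

Since $(E, V) \mapsto S$ is concave, we deduce:

$S(\lambda E + (1 - \lambda)\hat{E}, \lambda V + (1 - \lambda)\hat{V}, \lambda N + (1 - \lambda)\hat{N})$
$\ge (\lambda N + (1 - \lambda)\hat{N}) \left[\mu S\left(\frac{E}{N}, \frac{V}{N}\right) + (1 - \mu) S\left(\frac{\hat{E}}{\hat{N}}, \frac{\hat{V}}{\hat{N}}\right)\right]$
$= \lambda N S\left(\frac{E}{N}, \frac{V}{N}\right) + (1 - \lambda)\hat{N} S\left(\frac{\hat{E}}{\hat{N}}, \frac{\hat{V}}{\hat{N}}\right)$
$= \lambda S(E, V, N) + (1 - \lambda) S(\hat{E}, \hat{V}, \hat{N})$.

This proves (5). □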
A simple ideal gas is an ideal gas for which $C_V$ (and so $C_P$) are positive constants. Thus

(6) $\gamma := \frac{C_P}{C_V} > 1$.

From (2), (3), we deduce for a simple ideal gas that

(7)  $E = C_V T$
     $S = R \log V + C_V \log T + S_0$,

where we have set $E_0 = 0$. Thus for a simple ideal gas

(8)  $S(E, V) = R \log V + C_V \log E + S_0$  $(N = 1)$
     $S(E, V, N) = N R \log\left(\frac{V}{N}\right) + N C_V \log\left(\frac{E}{N}\right) + S_0 N$.

(The constants $S_0$ in (7), (8) and below differ.) For later reference we record:

(9)  $S = C_V \log(T V^{\gamma - 1}) + S_0$
     $S = C_V \log(P V^{\gamma}) + S_0$,

where $\gamma = \frac{C_P}{C_V}$, $C_P - C_V = R$, $N = 1$.
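The fundamental equation (8) for a simple ideal gas gives a concrete function on which to test the hypotheses of §A: homogeneity of degree 1, concavity, and $\frac{\partial S}{\partial E} = \frac{1}{T}$. The sketch below is only a numerical spot check; the numerical value of $R$, the choice $C_V = \frac{3}{2}R$, the constant $S_0 = 0$ and the sample states are assumptions made for illustration.

```python
import math

R = 8.314        # assumed numerical value of the gas constant
C_V = 1.5 * R    # assumed constant heat capacity at constant volume
S_0 = 0.0        # assumed additive constant


def entropy(E, V, N):
    # Formula (8): S(E, V, N) = N R log(V/N) + N C_V log(E/N) + S_0 N.
    return N * R * math.log(V / N) + N * C_V * math.log(E / N) + S_0 * N


state_a = (3000.0, 2.0, 1.5)
state_b = (500.0, 5.0, 0.7)

# Homogeneity of degree 1: S(lam E, lam V, lam N) = lam S(E, V, N).
lam = 2.5
homogeneity_gap = entropy(*(lam * x for x in state_a)) - lam * entropy(*state_a)

# Midpoint concavity: S((a + b)/2) >= (S(a) + S(b))/2.
midpoint = tuple(0.5 * (x + y) for x, y in zip(state_a, state_b))
concavity_gap = entropy(*midpoint) - 0.5 * (entropy(*state_a) + entropy(*state_b))

# dS/dE = 1/T with E = C_V T for N = 1, approximated by a central difference.
E, V, h = 3000.0, 2.0, 1e-3
dS_dE = (entropy(E + h, V, 1.0) - entropy(E - h, V, 1.0)) / (2 * h)
inverse_temperature = C_V / E
```

Here `homogeneity_gap` should vanish up to rounding, `concavity_gap` should be nonnegative, and `dS_dE` should match `inverse_temperature`. A check at two sample states is of course no substitute for the proof of (5).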
2. Van der Waals fluid 

A van der Waals fluid is a simple fluid with the equation of state

(10) $P = \frac{RT}{V - b} - \frac{a}{V^2}$  $(V > b, N = 1)$

for constants $a, b > 0$.

Theorem 2 For a van der Waals fluid,

(i) $C_V$ is a function of $T$ only:

$C_V = C_V(T)$.

(ii) $E$ as a function of $(T, V)$ is:

(11) $E = E(T, V) = \int_{T_0}^T C_V(\theta)\,d\theta - \frac{a}{V} + E_0$.

(iii) $S$ as a function of $(T, V)$ is

(12) $S = S(T, V) = R \log(V - b) + \int_{T_0}^T \frac{C_V(\theta)}{\theta}\,d\theta + S_0$.



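Proof. 1. As in the previous proof,

$\left(\frac{\partial E}{\partial V}\right)_T = T\left(\frac{\partial P}{\partial T}\right)_V - P$.

But $P = \frac{RT}{V - b} - \frac{a}{V^2}$ and so $\left(\frac{\partial E}{\partial V}\right)_T = \frac{a}{V^2}$. Hence if we think of $E$ as a function of $(T, V)$, we
deduce

$E = -\frac{a}{V}$ + (a function of $T$ alone).

But then

$C_V = \left(\frac{\partial E}{\partial T}\right)_V$

depends only on $T$. Formula (11) follows.

2. As before, $S = S(E(T, V), V)$. Then

$\left(\frac{\partial S}{\partial T}\right)_V = \frac{\partial S}{\partial E}\left(\frac{\partial E}{\partial T}\right)_V = \frac{1}{T} C_V(T)$

$\left(\frac{\partial S}{\partial V}\right)_T = \frac{\partial S}{\partial E}\left(\frac{\partial E}{\partial V}\right)_T + \frac{\partial S}{\partial V} = \frac{1}{T}\frac{a}{V^2} + \frac{P}{T} = \frac{R}{V - b}$.

Formula (12) results upon integration. □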
Note. $C_P$ depends on both $T$ and $V$ for a van der Waals fluid. □

We can define a simple van der Waals fluid, for which $C_V$ is a constant. Then

$E = C_V T - \frac{a}{V} + E_0$
$S = R \log(V - b) + C_V \log T + S_0$.

However if we solve for $S = S(E, V)$, $S$ is not concave everywhere. Thus a van der Waals fluid
fits into the foregoing framework only if we restrict attention to regions $\Sigma$ where $(E, V) \mapsto S$
is concave.
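The failure of concavity can be seen numerically. For a simple van der Waals fluid with $E_0 = S_0 = 0$, formula (11) gives $T = (E + a/V)/C_V$, so $S(E, V) = R \log(V - b) + C_V \log((E + a/V)/C_V)$. The sketch below estimates the Hessian of this function by finite differences at two states. The constants $R = a = b = 1$, $C_V = 1.5$ and the two sample states are hypothetical values chosen only for illustration.

```python
import math

# Assumed dimensionless constants (illustrative only, not from the notes).
R, a, b, C_V = 1.0, 1.0, 1.0, 1.5


def entropy(E, V):
    # S(E, V) for a simple van der Waals fluid, taking E_0 = S_0 = 0.
    T = (E + a / V) / C_V
    return R * math.log(V - b) + C_V * math.log(T)


def state(T, V):
    # Convert (T, V) to (E, V) using E = C_V T - a / V.
    return C_V * T - a / V, V


def hessian(E, V, h=1e-3):
    """Finite-difference estimates of S_EE, S_EV, S_VV at (E, V)."""
    S = entropy
    S_EE = (S(E + h, V) - 2 * S(E, V) + S(E - h, V)) / h**2
    S_VV = (S(E, V + h) - 2 * S(E, V) + S(E, V - h)) / h**2
    S_EV = (S(E + h, V + h) - S(E + h, V - h)
            - S(E - h, V + h) + S(E - h, V - h)) / (4 * h**2)
    return S_EE, S_EV, S_VV


def hessian_determinant(T, V):
    S_EE, S_EV, S_VV = hessian(*state(T, V))
    return S_EE * S_VV - S_EV**2


det_hot = hessian_determinant(1.0, 3.0)    # high temperature
det_cold = hessian_determinant(0.25, 3.0)  # low temperature, same volume
```

With these assumed constants, `det_hot` comes out positive (and $\frac{\partial^2 S}{\partial E^2} < 0$), which is consistent with local concavity, whereas `det_cold` comes out negative, so the Hessian is indefinite there and $S$ cannot be concave near that state.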

Remark. More generally we can replace $S$ by its concave envelope (= the smallest concave
function greater than or equal to $S$ in some region). See Callen [C] for a discussion of the
physical meaning of all this. □

CHAPTER 2: Entropy and irreversibility



In Chapter I we began with an axiomatic model for a thermal system in equilibrium, and 
so could immediately discuss energy, entropy, temperature, etc. This point of view is static 
in time. 

In this chapter we introduce various sorts of processes, involving changes in time of the 
parameters in what we now call a thermodynamic system. These, in conjunction with the 
First and Second Laws of thermodynamics, will allow us to construct E and S. 

A. A model material 

We begin by turning our attention again to the example of simple fluids, but now we 
reverse the point of view of Chapter I and ask rather: How can we construct the energy E 
and entropy S? We will follow Owen [O] (but see also Bharatha-Truesdell [B-T]).

1. Definitions 

Since we intend to build E, S, we must start with other variables, which we take to be 
T,V. 

A model for a homogeneous fluid body (without dissipation) 

Assume we are given: 

(a) an open, simply connected subset $\Sigma \subset (0, \infty) \times (0, \infty)$ ($\Sigma$ is the state space and
elements of $\Sigma$ are called states)

and

(b) $C^1$-functions $P, \Lambda_V, C_V$ defined on $\Sigma$ ($P$ is the pressure, $\Lambda_V$ the latent heat with
respect to volume, $C_V$ the heat capacity at constant volume).

Notation. We write $P = P(T, V)$, $\Lambda_V = \Lambda_V(T, V)$, $C_V = C_V(T, V)$ to display the
dependence of the functions $P, \Lambda_V, C_V$ on $(T, V)$. □

We further assume:

(1) $\frac{\partial P}{\partial V} < 0$, $\Lambda_V \ne 0$, $C_V > 0$ in $\Sigma$.

2. Energy and entropy 

a. Working and heating 
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We define a path T for our model to be an oriented, continuous, piecewise C 1 curve in E. 
A path is a cycle if its starting and endpoints coincide. 




Notation. We parameterize T by writing 

T = {(T(t),V(t)) for a < t < b}, 

where a < b and V, T : [a, b] — > R are C 1 . 
Definitions, (i) We define the working 1-form 

(2) cTW^ = PrfV 
and define the work done by the fluid along T to be 

(3) W(r) = J dW = J PdV. 

(ii) We likewise define the heating 1-form 

(4) dQ = C v dT + A v dV 
and define the net heat gained by the fluid along T to be 

(5) Q(r) = J dQ = J C v dT + A v dV. 



Remarks. (a) Thus

$W(\Gamma) = \int_a^b P(T(t),V(t))\,\dot V(t)\,dt \qquad \left(\dot{\ } = \frac{d}{dt}\right)$

and

$Q(\Gamma) = \int_a^b C_V(T(t),V(t))\,\dot T(t) + \Lambda_V(T(t),V(t))\,\dot V(t)\,dt$.

We call

(6) $w(t) = P(T(t),V(t))\,\dot V(t)$

the rate of working and

(7) $q(t) = C_V(T(t),V(t))\,\dot T(t) + \Lambda_V(T(t),V(t))\,\dot V(t)$

the rate of heating at time $t$ ($a \le t \le b$).

(b) Note very carefully that there do not in general exist functions $W$, $Q$ of $(T,V)$
whose differentials are the working, heating 1-forms. The slash through the "d" in $đW$, $đQ$
emphasizes this.

Consequently $W(\Gamma)$, $Q(\Gamma)$ depend upon the path traversed by $\Gamma$ and not merely upon its
endpoints. However, $W(\Gamma)$, $Q(\Gamma)$ do not depend upon the parameterization of $\Gamma$. □
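The path dependence in Remark (b) can be checked numerically. The following is a minimal sketch, assuming an ideal gas with $P = \Lambda_V = RT/V$ and a constant heat capacity $C_V = \tfrac32 R$ (an illustrative choice, not part of the general model); the function names and the two test paths are hypothetical. It integrates the working and heating 1-forms along two different paths joining the same pair of states.

```python
import math

R = 8.314          # gas constant in J/(mol K); one mole of ideal gas (assumption)
C_V = 1.5 * R      # assumed constant heat capacity, for illustration only

def pressure(T, V):
    return R * T / V

def latent_heat(T, V):
    return R * T / V   # Lambda_V = RT/V for an ideal gas

def work_and_heat(vertices, steps=10000):
    """Integrate dW = P dV and dQ = C_V dT + Lambda_V dV along the polygonal
    path through `vertices` (a list of (T, V) states) by the midpoint rule."""
    W = Q = 0.0
    for (Ta, Va), (Tb, Vb) in zip(vertices, vertices[1:]):
        dT, dV = (Tb - Ta) / steps, (Vb - Va) / steps
        for i in range(steps):
            Tm = Ta + (i + 0.5) * dT
            Vm = Va + (i + 0.5) * dV
            W += pressure(Tm, Vm) * dV
            Q += C_V * dT + latent_heat(Tm, Vm) * dV
    return W, Q

alpha, beta = (300.0, 1.0), (400.0, 2.0)
# Path A: heat at constant volume, then expand isothermally at T = 400.
W_A, Q_A = work_and_heat([alpha, (400.0, 1.0), beta])
# Path B: expand isothermally at T = 300, then heat at constant volume.
W_B, Q_B = work_and_heat([alpha, (300.0, 2.0), beta])

print(W_A, W_B)              # the work differs between the two paths
print(Q_A - W_A, Q_B - W_B)  # the difference Q - W agrees for this gas
```

Under these assumptions the work and the heat each differ between the two paths, while the difference $Q - W$ is the same along both.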

Physical interpretations. (1) If we think of our homogeneous fluid body as occupying the
region $U(t) \subset \mathbb{R}^3$ at time $t$, then the rate of work at time $t$ is

$w(t) = \int_{\partial U(t)} P\, \mathbf{v} \cdot \nu \, dS$,

$\mathbf{v}$ denoting the velocity field and $\nu$ the outward unit normal field along $\partial U(t)$. Since we
assume $P$ is independent of position, we have

$w(t) = P \int_{\partial U(t)} \mathbf{v} \cdot \nu \, dS = P \frac{d}{dt}\left(\int_{U(t)} dx\right) = P \dot V(t)$,

in accordance with (6).

(2) Similarly, $\Lambda_V$ records the gain of heat owing to the volume change (at fixed temperature
$T$) and $C_V$ records the gain of heat due to temperature change (at fixed volume
$V$). □

Definitions. Let $\Gamma = \{(T(t),V(t)) \mid a \le t \le b\}$ be a path in $\Sigma$.

(i) $\Gamma$ is called isothermal if $T(t)$ is constant ($a \le t \le b$).

(ii) $\Gamma$ is called adiabatic if $q(t) = 0$ ($a \le t \le b$).

Construction of adiabatic paths. Since $q(t) = C_V(T(t),V(t))\,\dot T(t) + \Lambda_V(T(t),V(t))\,\dot V(t)$
($a \le t \le b$), we can construct adiabatic paths by introducing the parameterization $(T, V(T))$
and solving the ODE

(8) $\frac{dV}{dT} = -\frac{C_V(V,T)}{\Lambda_V(V,T)}$ (ODE for adiabatic paths)

for $V$ as a function of $T$, $V = V(T)$. Taking different initial conditions for (8) gives different
adiabatic paths (a.k.a. adiabats).

Any $C^1$ parameterization of the graph of $V = V(T)$ gives an adiabatic path.
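As an illustration of how (8) produces an adiabat, the sketch below integrates the ODE with a classical fourth-order Runge-Kutta step. It assumes an ideal gas with $\Lambda_V = RT/V$ and constant $C_V = \tfrac32 R$; this choice, the step count, and the function names are assumptions made only for the example. For this gas the ODE has the closed-form solution $V(T) = V_0 (T/T_0)^{-C_V/R}$, which serves as a check.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def C_V(T, V):
    return 1.5 * R          # assumed constant heat capacity

def Lambda_V(T, V):
    return R * T / V        # latent heat of an ideal gas

def adiabat_slope(T, V):
    """Right-hand side of (8): dV/dT = -C_V / Lambda_V."""
    return -C_V(T, V) / Lambda_V(T, V)

def integrate_adiabat(T0, V0, T1, steps=1000):
    """Follow the adiabat through (T0, V0) to temperature T1 using RK4."""
    h = (T1 - T0) / steps
    T, V = T0, V0
    for _ in range(steps):
        k1 = adiabat_slope(T, V)
        k2 = adiabat_slope(T + h / 2, V + h * k1 / 2)
        k3 = adiabat_slope(T + h / 2, V + h * k2 / 2)
        k4 = adiabat_slope(T + h, V + h * k3)
        V += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        T += h
    return V

V_numeric = integrate_adiabat(400.0, 1.0, 300.0)
V_exact = 1.0 * (300.0 / 400.0) ** (-1.5)   # V0 (T/T0)^(-C_V/R)
print(V_numeric, V_exact)
```

Changing the initial state `(T0, V0)` selects a different adiabat, as the text describes.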

b. The First Law, existence of E 

We turn now to our basic task, building E, S for our fluid system. The existence of 
these quantities will result from physical principles, namely the First and Second Laws of 
thermodynamics. 

We begin with a form of the First Law: We hereafter assume that for every cycle $\Gamma$ of
our homogeneous fluid body, we have:

(9) $W(\Gamma) = Q(\Gamma)$.



This is conservation of energy: The work done by the fluid along any cycle equals the 
heat gained by the fluid along the cycle. 

Remark. We assume in (9) that the units of work and heat are the same. If not, e.g. if heat is 
measured in calories and work in Joules (Appendix A), we must include in (9) a multiplicative 
factor on the right hand side called the mechanical equivalent of heat (= 4.184J/calorie). 

□ 

We deduce this immediate mathematical corollary: 

Theorem 1 For our homogeneous fluid body, there exists a $C^2$ function $E : \Sigma \to \mathbb{R}$ such
that

(10) $\frac{\partial E}{\partial T} = C_V, \quad \frac{\partial E}{\partial V} = \Lambda_V - P$.

We call $E = E(T,V)$ the internal energy.

Proof. According to (3), (5), (9):

$\int_\Gamma C_V\,dT + (\Lambda_V - P)\,dV = 0$

for each cycle in $\Sigma$. The 1-form $C_V\,dT + (\Lambda_V - P)\,dV$ is thus exact, since $\Sigma$ is open, simply
connected. This means there exists a $C^2$ function $E$ with

(11) $dE = C_V\,dT + (\Lambda_V - P)\,dV$.

This statement is the same as (10). □

Notation. From (11), it follows that

(12) $dE = đQ - đW$,

where $dE$ is an exact 1-form, whereas $đQ$ and $đW$ are non-exact 1-forms.



c. Carnot cycles 

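Definition. A Carnot cycle $\Gamma$ for our fluid is a cycle consisting of two distinct adiabatic
paths and two distinct isothermal paths, as drawn:

[Figure: a Carnot cycle in the $(T,V)$ plane, made up of the isothermal paths $\Gamma_a$ (at temperature $T_2$) and $\Gamma_c$ (at temperature $T_1$) and the adiabatic paths $\Gamma_b$, $\Gamma_d$.]

(We assume $\Lambda_V > 0$ for this picture and take a counterclockwise orientation.)
We have $Q(\Gamma_b) = Q(\Gamma_d) = 0$, since $\Gamma_b$, $\Gamma_d$ are adiabatic paths.

Notation.

$Q^- = -Q(\Gamma_c)$ = heat emitted at temperature $T_1$

$Q^+ = Q(\Gamma_a)$ = heat gained at temperature $T_2$

$Q = W(\Gamma) = Q^+ - Q^-$ = work.

Definition. A Carnot cycle $\Gamma$ is a Carnot heat engine if

(13) $Q^+ > 0$ (heat is gained at the higher temperature $T_2$) and $Q^- > 0$ (heat is lost at the lower temperature $T_1$).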
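The picture above illustrates the correct orientation of $\Gamma$ for a Carnot heat engine, provided
$\Lambda_V > 0$.

Example. Suppose our fluid body is in fact an ideal gas (discussed in §I.F). Then $PV = RT$
if we consider $N = 1$ mole, and

(14) $P(T,V) = \frac{RT}{V}, \quad C_V(T,V) = C_V(T), \quad \Lambda_V(T,V) = \frac{RT}{V}$.

(The formula for $\Lambda_V$ is motivated by our recalling from §I.E that we should have
$\Lambda_V = T\left(\frac{\partial S}{\partial V}\right)_T = T\left(\frac{\partial P}{\partial T}\right)_V = \frac{RT}{V}$.) Consider a Carnot heat engine, as drawn:

[Figure: Carnot heat engine for an ideal gas in the $(T,V)$ plane, with volumes $V_1$, $V_2$ at temperature $T_2$ and $V_4$, $V_3$ at temperature $T_1$, joined by the adiabats $V = V_1(T)$ and $V = V_2(T)$.]

We compute

(15) $Q^+ = \int_{V_1}^{V_2} \Lambda_V\,dV = RT_2 \log\left(\frac{V_2}{V_1}\right)$.

The equation for the adiabatic parts of the cycle, according to (8), is:

$\frac{dV}{dT} = -\frac{C_V}{\Lambda_V} = -\frac{V C_V(T)}{RT}$.

Hence the formulas for the lower and upper adiabats are:

$V_1(T) = V_1 \exp\left(-\int_{T_2}^{T} \frac{C_V(\theta)}{R\theta}\,d\theta\right)$
$V_2(T) = V_2 \exp\left(-\int_{T_2}^{T} \frac{C_V(\theta)}{R\theta}\,d\theta\right)$,

and so

$V_4 = V_1(T_1) = V_1 \exp\left(-\int_{T_2}^{T_1} \frac{C_V(\theta)}{R\theta}\,d\theta\right)$

$V_3 = V_2(T_1) = V_2 \exp\left(-\int_{T_2}^{T_1} \frac{C_V(\theta)}{R\theta}\,d\theta\right)$.

Therefore

$Q^- = -\int_{V_3}^{V_4} \Lambda_V\,dV = -RT_1 \log\left(\frac{V_4}{V_3}\right) = -RT_1 \log\left(\frac{V_1}{V_2}\right) > 0$.

The work is

$W = Q^+ - Q^- = R(T_2 - T_1) \log\left(\frac{V_2}{V_1}\right) > 0$;

and for later reference we deduce from (15) that

(16) $W = \left(1 - \frac{T_1}{T_2}\right) Q^+$ for a Carnot cycle of an ideal gas.

□

d. The Second Law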



We next hypothesize the following form of the Second Law of thermodynamics: For
each Carnot heat engine $\Gamma$ of our homogeneous fluid body, operating between temperatures
$T_1 < T_2$, we have

$W > 0$

and

(17) $W = \left(1 - \frac{T_1}{T_2}\right) Q^+$.

In other words we are assuming that formula (17), which we showed above holds for any
Carnot heat engine for an ideal gas (see (16)), in fact holds for any Carnot heat engine for our general
homogeneous fluid body.

Physical interpretation. The precise relation (17) can be motivated as follows from this
general, if vague, statement, due essentially to Clausius:

(18) "no system which employs homogeneous fluid bodies
operating through cycles can absorb heat at one temperature
$T_1$ and emit the same amount of heat at a higher
temperature $T_2 > T_1$, without doing work on its environment".

Let us first argue physically that (18) implies this assertion:

(19) "If $\Gamma$, $\tilde\Gamma$ are two Carnot heat engines (for possibly
different homogeneous fluid bodies) and $\Gamma$, $\tilde\Gamma$ both operate
between the same temperatures $T_2 > T_1$, then
$W = \tilde W$ implies $Q^+ = \tilde Q^+$".

This says that "any two Carnot cycles which operate between the same temperatures
and which perform the same work, must absorb the same heat at the higher temperature".

Physical derivation of (19) from (18). To see why (19) is in some sense a consequence
of (18), suppose not. Then for two fluid bodies we could find Carnot heat engines $\Gamma$, $\tilde\Gamma$
operating between the temperatures $T_2 > T_1$, such that

$W = \tilde W$, but $\tilde Q^+ > Q^+$.

Then since $W = \tilde W$, we observe

$Q^- - \tilde Q^- = Q^+ - \tilde Q^+ < 0$.

Imagine now the process $\Delta$ consisting of $\Gamma$ followed by the reversal of $\tilde\Gamma$. Then $\Delta$ would
absorb $Q = -(Q^- - \tilde Q^-) > 0$ units of heat at the lower temperature $T_1$ and emit the same $Q$
units of heat at the higher temperature. But since $W - \tilde W = 0$, no work would be performed
by $\Delta$. This would all contradict (18), however.

Physical derivation of (17) from (19). Another way of stating (19) is that for a Carnot
heat engine, $Q^+$ is some function $\phi(T_1,T_2,W)$ of the operating temperatures $T_1$, $T_2$ and the
work $W$, and further this function is the same for all fluid bodies.
But (16) says

$Q^+ = \frac{T_2}{T_2 - T_1} W = \phi(T_1,T_2,W)$

for an ideal gas. Hence (19) implies we have the same formula for any homogeneous fluid
body. This is (17). □
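The ideal-gas identity (16), on which the derivation above rests, can be reproduced numerically. The sketch below follows the formulas of the ideal-gas example; the temperature-dependent heat capacity `heat_capacity` is a hypothetical choice made only to show that the result does not depend on the form of $C_V(T)$, and all function names are illustrative.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def heat_capacity(T):
    # Hypothetical temperature-dependent C_V(T), chosen only for illustration.
    return R * (1.5 + 1.0e-3 * T)

def adiabat_exponent(T_from, T_to, steps=20000):
    """Midpoint approximation of the (signed) integral of
    C_V(theta) / (R theta) from T_from to T_to."""
    h = (T_to - T_from) / steps
    total = 0.0
    for i in range(steps):
        theta = T_from + (i + 0.5) * h
        total += heat_capacity(theta) / (R * theta) * h
    return total

def carnot_ideal_gas(T1, T2, V1, V2):
    """Heat gained, heat emitted and work for the Carnot heat engine whose
    upper isotherm runs from V1 to V2 at temperature T2."""
    I = adiabat_exponent(T2, T1)
    V4 = V1 * math.exp(-I)           # end of the lower adiabat at T1
    V3 = V2 * math.exp(-I)           # end of the upper adiabat at T1
    Q_plus = R * T2 * math.log(V2 / V1)
    Q_minus = -R * T1 * math.log(V4 / V3)
    return Q_plus, Q_minus, Q_plus - Q_minus

Q_plus, Q_minus, W = carnot_ideal_gas(T1=300.0, T2=400.0, V1=1.0, V2=2.0)
print(W / Q_plus, 1 - 300.0 / 400.0)   # the two ratios agree
```

The ratio $V_4/V_3 = V_1/V_2$ is independent of $C_V(T)$, which is why the efficiency depends only on the two temperatures.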

Remark. See Owen [O], Truesdell [TR, Appendix 1A], Bharatha-Truesdell [B-T] for a more 
coherent discussion. □ 

e. Existence of S 

We next exploit (17) to build an entropy function S for our model homogeneous fluid 
body: 

Theorem 2 For our homogeneous fluid body, there exists a $C^2$ function $S : \Sigma \to \mathbb{R}$ such
that

(20) $\frac{\partial S}{\partial T} = \frac{C_V}{T}, \quad \frac{\partial S}{\partial V} = \frac{\Lambda_V}{T}$.

We call $S = S(T,V)$ the entropy.

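Proof. 1. Fix a point $(T_*, V_*)$ in $\Sigma$ and consider a Carnot heat engine as drawn (assuming
$\Lambda_V > 0$):

[Figure: Carnot heat engine in the $(T,V)$ plane near the point $(V_*, T_*)$, bounded by the isotherms at $T_1$, $T_2$ and the adiabats $V = V_1(T)$, $V = V_2(T)$.]

Now

(21) $Q^+ = \int_{V_1}^{V_2} \Lambda_V(V,T_2)\,dV$.

Furthermore

$W = \int_\Gamma P\,dV = \int_{T_1}^{T_2} \int_{V_1(T)}^{V_2(T)} \frac{\partial P}{\partial T}\,dV\,dT$

by the Gauss-Green Theorem. This identity, (17) and (21) imply

$\int_{V_1}^{V_2} \Lambda_V(V,T_2)\,dV = \frac{T_2}{T_2 - T_1} \int_{T_1}^{T_2} \int_{V_1(T)}^{V_2(T)} \frac{\partial P}{\partial T}\,dV\,dT$.

Let $T_1 \to T_2 = T_*$:

$\int_{V_1}^{V_2} \Lambda_V(V,T_*)\,dV = T_* \int_{V_1}^{V_2} \frac{\partial P}{\partial T}(V,T_*)\,dV$.

Divide by $V_2 - V_1$ and let $V_2 \to V_1 = V_*$, to deduce

(22) $\Lambda_V = T \frac{\partial P}{\partial T}$ (Clapeyron's formula)

at the point $(T_*, V_*)$. Since this point was arbitrary, the identity (22) is valid everywhere in $\Sigma$.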
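2. Recall from (10) that

(23) $\frac{\partial E}{\partial T} = C_V, \quad \frac{\partial E}{\partial V} = \Lambda_V - P$.

Consequently:

$\frac{\partial}{\partial T}\left(\frac{\Lambda_V}{T}\right) = \frac{1}{T} \frac{\partial \Lambda_V}{\partial T} - \frac{\Lambda_V}{T^2}$

$= \frac{1}{T}\left(\frac{\partial^2 E}{\partial V \partial T} + \frac{\partial P}{\partial T}\right) - \frac{1}{T} \frac{\partial P}{\partial T}$ by (22), (23)

$= \frac{1}{T} \frac{\partial^2 E}{\partial V \partial T} = \frac{\partial}{\partial V}\left(\frac{C_V}{T}\right)$ by (23) again.

Thus the form

$\frac{C_V}{T}\,dT + \frac{\Lambda_V}{T}\,dV$

is exact: there exists a $C^2$-function $S$ such that

(24) $dS = \frac{C_V}{T}\,dT + \frac{\Lambda_V}{T}\,dV$.

This is (20). □

Notation. From (24) it follows that

(25) $dS = \frac{đQ}{T}$,

and so (12) becomes Gibbs' formula:

$dE = T\,dS - P\,dV$.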
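Formula (25) says that dividing the heating 1-form by $T$ turns it into an exact form. A small numerical sketch of this, assuming an ideal gas with constant $C_V = \tfrac32 R$ (for which (24) integrates to $S = C_V \log T + R \log V$ up to an additive constant); the names and the two test paths are hypothetical:

```python
import math

R = 8.314       # gas constant, J/(mol K)
C_V = 1.5 * R   # assumed constant heat capacity

def entropy(T, V):
    # Integral of (24) for this gas, up to an additive constant.
    return C_V * math.log(T) + R * math.log(V)

def heat_and_entropy_change(vertices, steps=10000):
    """Midpoint-rule integrals of dQ and of dQ / T along the polygonal
    path through `vertices` (a list of (T, V) states)."""
    Q = dS = 0.0
    for (Ta, Va), (Tb, Vb) in zip(vertices, vertices[1:]):
        dT, dV = (Tb - Ta) / steps, (Vb - Va) / steps
        for i in range(steps):
            Tm = Ta + (i + 0.5) * dT
            Vm = Va + (i + 0.5) * dV
            dq = C_V * dT + (R * Tm / Vm) * dV   # C_V dT + Lambda_V dV
            Q += dq
            dS += dq / Tm
    return Q, dS

alpha, beta = (300.0, 1.0), (400.0, 2.0)
Q_A, dS_A = heat_and_entropy_change([alpha, (400.0, 1.0), beta])
Q_B, dS_B = heat_and_entropy_change([alpha, (300.0, 2.0), beta])

print(Q_A, Q_B)                                        # path dependent
print(dS_A, dS_B, entropy(*beta) - entropy(*alpha))    # all three agree
```

The heat $Q$ differs between the two paths, while the integral of $đQ/T$ depends only on the endpoints.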

S as a function of (E, V). We have determined E, S as functions of (T, V). To be consistent 
with the axiomatic approach in Chapter I, however, we should consider S as a function of 
the extensive variables (E,V). 

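First, since $\frac{\partial E}{\partial T} = C_V > 0$, we can solve for $T = T(E,V)$. Then $P = P(T,V) =
P(T(E,V),V)$ gives $P$ as a function of $(E,V)$. Also the formulas $S = S(T,V) = S(T(E,V),V)$
display $S$ as a function of $(E,V)$. Consequently

(26) $\left(\frac{\partial S}{\partial E}\right)_V = \left(\frac{\partial S}{\partial T}\right)_V \frac{\partial T}{\partial E} = \frac{1}{T}$ by (20), since $\frac{\partial T}{\partial E} = \frac{1}{C_V}$,

and

(27) $\left(\frac{\partial S}{\partial V}\right)_E = \left(\frac{\partial S}{\partial T}\right)_V \frac{\partial T}{\partial V} + \left(\frac{\partial S}{\partial V}\right)_T = \frac{C_V}{T} \frac{\partial T}{\partial V} + \frac{\Lambda_V}{T}$ by (20).

But $E(T(W,V),V) = W$ for all $W$ and so

$\left(\frac{\partial E}{\partial T}\right)_V \frac{\partial T}{\partial V} + \left(\frac{\partial E}{\partial V}\right)_T = 0$.

Hence (10) implies

(28) $C_V \frac{\partial T}{\partial V} = P - \Lambda_V$.

Consequently (27) says $\left(\frac{\partial S}{\partial V}\right)_E = \frac{P}{T}$. In summary:

(29) $\left(\frac{\partial S}{\partial E}\right)_V = \frac{1}{T}, \quad \left(\frac{\partial S}{\partial V}\right)_E = \frac{P}{T}$,

as expected from the general theory in Chapter I.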
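Finally we check that

(30) $S$ is a concave function of $(E,V)$.

For proving this, we deduce first from (29) that for $S = S(E,V)$:

(31) $\frac{\partial^2 S}{\partial E^2} = -\frac{1}{C_V T^2} < 0$.

Also

$\frac{\partial^2 S}{\partial V^2} = \frac{1}{T}\left(\frac{\partial P}{\partial V}\right)_E - \frac{P}{T^2}\left(\frac{\partial T}{\partial V}\right)_E$.

Now

$\left(\frac{\partial P}{\partial V}\right)_E = \frac{\partial P}{\partial V} + \frac{\partial P}{\partial T}\left(\frac{\partial T}{\partial V}\right)_E$,

and so

$\frac{\partial^2 S}{\partial V^2} = \frac{1}{T} \frac{\partial P}{\partial V} + \frac{1}{T^2}\left(T \frac{\partial P}{\partial T} - P\right)\left(\frac{\partial T}{\partial V}\right)_E$.

But

$\left(\frac{\partial T}{\partial V}\right)_E = -\frac{\partial E / \partial V}{\partial E / \partial T} = \frac{P - \Lambda_V}{C_V}$ by (10),

and (22) says:

$\Lambda_V = T \frac{\partial P}{\partial T}$.

Thus

(32) $\frac{\partial^2 S}{\partial V^2} = \frac{1}{T} \frac{\partial P}{\partial V} - \frac{1}{C_V T^2}(P - \Lambda_V)^2 < 0$,

since $\frac{\partial P}{\partial V} < 0$, $C_V > 0$. Lastly,

$\frac{\partial^2 S}{\partial E \partial V} = -\frac{1}{T^2}\left(\frac{\partial T}{\partial V}\right)_E = \frac{\Lambda_V - P}{T^2 C_V}$.

Consequently (31), (32) imply

(33) $\left(\frac{\partial^2 S}{\partial E^2}\right)\left(\frac{\partial^2 S}{\partial V^2}\right) - \left(\frac{\partial^2 S}{\partial E \partial V}\right)^2 = \left(-\frac{1}{C_V T^2}\right)\left(\frac{1}{T} \frac{\partial P}{\partial V} - \frac{1}{C_V T^2}(P - \Lambda_V)^2\right) - \frac{(\Lambda_V - P)^2}{T^4 C_V^2} = -\frac{1}{C_V T^3} \frac{\partial P}{\partial V} > 0$.

Owing to (31), (32), (33) $S$ is a concave function of $(E,V)$. □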
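The three inequalities (31), (32), (33) can be spot-checked by finite differences. The sketch below assumes an ideal gas with constant $C_V = \tfrac32 R$, so that $E = C_V T$ and $S(E,V) = C_V \log(E/C_V) + R \log V$ up to a constant; the step sizes and names are illustrative choices. For this gas the determinant in (33) reduces to $R/(C_V T^2 V^2)$.

```python
import math

R = 8.314       # gas constant, J/(mol K)
C_V = 1.5 * R   # assumed constant heat capacity

def S(E, V):
    # Entropy of the ideal gas as a function of (E, V), using T = E / C_V.
    return C_V * math.log(E / C_V) + R * math.log(V)

def hessian(f, E, V, hE, hV):
    """Central finite-difference second derivatives of f at (E, V)."""
    f0 = f(E, V)
    f_EE = (f(E + hE, V) - 2 * f0 + f(E - hE, V)) / hE**2
    f_VV = (f(E, V + hV) - 2 * f0 + f(E, V - hV)) / hV**2
    f_EV = (f(E + hE, V + hV) - f(E + hE, V - hV)
            - f(E - hE, V + hV) + f(E - hE, V - hV)) / (4 * hE * hV)
    return f_EE, f_VV, f_EV

T, V = 300.0, 1.0
E = C_V * T
S_EE, S_VV, S_EV = hessian(S, E, V, hE=1.0, hV=1e-3)
det = S_EE * S_VV - S_EV**2
det_formula = R / (C_V * T**2 * V**2)   # -(1/(C_V T^3)) dP/dV with P = RT/V

print(S_EE, S_VV, det, det_formula)
```

Both diagonal entries are negative and the determinant is positive, so the Hessian is negative definite at the sampled state.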

3. Efficiency of cycles 

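Recall from §2 that

$q(t)$ = rate of heating at time $t$

$= C_V(T(t),V(t))\,\dot T(t) + \Lambda_V(T(t),V(t))\,\dot V(t)$,

where $\Gamma = \{(T(t),V(t)) \mid a \le t \le b\}$ is a path in $\Sigma$.

Notation.

(i) $q^+(t) = q(t)$ if $q(t) \ge 0$, and $q^+(t) = 0$ if $q(t) \le 0$;

$q^-(t) = 0$ if $q(t) \ge 0$, and $q^-(t) = -q(t)$ if $q(t) \le 0$;

$q(t) = q^+(t) - q^-(t)$ ($a \le t \le b$).

(ii) $Q^+(\Gamma) = \int_a^b q^+(t)\,dt$ = heat gained along $\Gamma$

$Q^-(\Gamma) = \int_a^b q^-(t)\,dt$ = heat emitted along $\Gamma$.

(iii) $W(\Gamma) = Q^+(\Gamma) - Q^-(\Gamma)$ = work performed along $\Gamma$.

Definition. Assume $\Gamma$ is a cycle. The efficiency of $\Gamma$ is

(34) $\eta = \frac{W(\Gamma)}{Q^+(\Gamma)}$,

the ratio of the work performed to the heat absorbed. Note $0 \le \eta \le 1$.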






Example. If $\Gamma$ is a Carnot heat engine operating between temperatures $0 < T_1 < T_2$, we
have

(35) $\eta = 1 - \frac{T_1}{T_2}$

according to (16). □

Notation. Let $\Gamma$ be an arbitrary cycle in $\Sigma$, with parameterization $\{(T(t),V(t)) \mid a \le t \le b\}$. Let

(36) $T_1 = \min\{T(t) \mid a \le t \le b\}$, $\quad T_2 = \max\{T(t) \mid a \le t \le b\}$

denote the lowest and highest temperatures occurring in the cycle.




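Theorem 3 Let $\Gamma$ be a cycle as above, and let $\eta$ denote its efficiency. Then

(37) $\eta \le 1 - \frac{T_1}{T_2}$.

Proof. According to (20),

$\frac{q}{T} = \frac{C_V}{T} \dot T + \frac{\Lambda_V}{T} \dot V = \frac{d}{dt} S(T(t),V(t))$.

Since $\Gamma$ is a cycle, we therefore have

(38) $0 = \int_\Gamma \frac{đQ}{T} = \int_a^b \frac{q}{T}\,dt$.

Then

(39) $0 = \int_a^b \frac{q^+}{T} - \frac{q^-}{T}\,dt \ge \frac{1}{T_2} \int_a^b q^+\,dt - \frac{1}{T_1} \int_a^b q^-\,dt = \frac{Q^+(\Gamma)}{T_2} - \frac{Q^-(\Gamma)}{T_1}$,

since $q^+ \ge 0$, $q^- \ge 0$, $T_1 \le T \le T_2$ on $\Gamma$.
Consequently:

$\eta = \frac{W(\Gamma)}{Q^+(\Gamma)} = 1 - \frac{Q^-(\Gamma)}{Q^+(\Gamma)} \le 1 - \frac{T_1}{T_2}$.

□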
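To see the estimate (37) at work on a cycle that is not a Carnot cycle, the sketch below computes the efficiency of a rectangular cycle in the $(T,V)$ plane (two isotherms joined by two constant-volume paths). It assumes an ideal gas with constant $C_V = \tfrac32 R$; the cycle, the step count, and the names are illustrative.

```python
import math

R = 8.314       # gas constant, J/(mol K)
C_V = 1.5 * R   # assumed constant heat capacity

def efficiency(vertices, steps=5000):
    """Efficiency W / Q+ of the closed polygonal cycle through `vertices`,
    splitting the heating into its positive and negative parts."""
    Q_plus = Q_minus = 0.0
    closed = vertices + [vertices[0]]
    for (Ta, Va), (Tb, Vb) in zip(closed, closed[1:]):
        dT, dV = (Tb - Ta) / steps, (Vb - Va) / steps
        for i in range(steps):
            Tm = Ta + (i + 0.5) * dT
            Vm = Va + (i + 0.5) * dV
            dq = C_V * dT + (R * Tm / Vm) * dV   # heating on this sub-step
            if dq > 0:
                Q_plus += dq
            else:
                Q_minus -= dq
    return (Q_plus - Q_minus) / Q_plus

T1, T2, V1, V2 = 300.0, 400.0, 1.0, 2.0
# Expand at T2, cool at volume V2, compress at T1, reheat at volume V1.
cycle = [(T2, V1), (T2, V2), (T1, V2), (T1, V1)]
eta = efficiency(cycle)
carnot_bound = 1 - T1 / T2

print(eta, carnot_bound)
```

Heat is also absorbed during the constant-volume reheating at temperatures below $T_2$, which is why this cycle falls strictly below the bound.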



We will later (in §B) take the efficiency estimate (37) as a starting point for more general 
theory. 

4. Adding dissipation, Clausius inequality 

We propose next, following Owen [O] and Serrin [S1], to modify our model to include
irreversible, dissipative effects.

Notation. Remember that we represent a parametric curve $\Gamma$ by writing

(40) $(T(t),V(t))$ for $a \le t \le b$,

where $a < b$ and $V, T : [a,b] \to \mathbb{R}$ are $C^1$. We will henceforth call $\Gamma$ a process to emphasize
that the following constructions depend on the parameterization. We call $\Gamma$ a cyclic process
if $(T(a),V(a)) = (T(b),V(b))$.

A model for a homogeneous fluid body (with dissipation). Assume we are given:

(a) a convex open subset $\Sigma \subset (0,\infty) \times (0,\infty)$ ($\Sigma$ is the state space),

and

(b) two $C^2$ functions $\mathcal{W}$, $\mathcal{Q}$ defined on $\Sigma \times \mathbb{R}^2$.

Notation. We will write

$\mathcal{W} = \mathcal{W}(T,V,A,B)$
$\mathcal{Q} = \mathcal{Q}(T,V,A,B)$

where $(T,V) \in \Sigma$, $(A,B) \in \mathbb{R}^2$. □

We assume further that $\mathcal{W}$, $\mathcal{Q}$ have the form:

(41) $\mathcal{W}(T,V,A,B) = P(T,V)B + R_1(T,V,A,B)$
$\mathcal{Q}(T,V,A,B) = C_V(T,V)A + \Lambda_V(T,V)B + R_2(T,V,A,B)$

for all $(T,V) \in \Sigma$, $(A,B) \in \mathbb{R}^2$, where the remainder terms $R_1$, $R_2$ satisfy:

(42) $|R_1(T,V,A,B)|, |R_2(T,V,A,B)| \le C(A^2 + B^2)$

for some constant $C$ and all $(T,V)$, $(A,B)$ as above. Lastly we suppose that $P$, $C_V$, $\Lambda_V$
satisfy:

(43) $\frac{\partial P}{\partial V} < 0, \quad \Lambda_V \neq 0, \quad C_V > 0$ in $\Sigma$.

Given a process $\Gamma$ as above, we define the work done by the fluid along $\Gamma$ to be

(44) $W(\Gamma) = \int_\Gamma đW = \int_a^b \mathcal{W}(T(t),V(t),\dot T(t),\dot V(t))\,dt$

and the net heat gained by the fluid along $\Gamma$ to be

(45) $Q(\Gamma) = \int_\Gamma đQ = \int_a^b \mathcal{Q}(T(t),V(t),\dot T(t),\dot V(t))\,dt$.

Notation. We call

(46) $w(t) = \mathcal{W}(T(t),V(t),\dot T(t),\dot V(t))$

the rate of working and

(47) $q(t) = \mathcal{Q}(T(t),V(t),\dot T(t),\dot V(t))$ ($a \le t \le b$)

the rate of heating at time $t$. In this model the expressions "$\int_\Gamma đW$" and "$\int_\Gamma đQ$" are defined
by (44), (45), but "$đW$" and "$đQ$" are not defined.

Remark. Note very carefully: $W(\Gamma)$, $Q(\Gamma)$ depend not only on the path described by $\Gamma$ but
also on the parameterization. □

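Our intention is to build energy and entropy functions for our new model fluid with
dissipation. As before we start with a form of the First Law: For each cyclic process $\Gamma$ of
our homogeneous fluid body with dissipation, we assume

(48) $W(\Gamma) = Q(\Gamma)$.

This is again conservation of energy, now hypothesized for every cyclic process, i.e. for
each cycle as a path and each parameterization.

Theorem 4 There exists a $C^2$ function $E : \Sigma \to \mathbb{R}$ such that

(49) $\frac{\partial E}{\partial T} = C_V, \quad \frac{\partial E}{\partial V} = \Lambda_V - P$.

Proof. Consider any cyclic process $\Gamma$.

We may assume $a = 0$, $b > 0$. Then (48) says

$\int_0^b \mathcal{W}(T(t),V(t),\dot T(t),\dot V(t))\,dt = \int_0^b \mathcal{Q}(T(t),V(t),\dot T(t),\dot V(t))\,dt$.

Owing to (41), we can rewrite:

$\int_0^b C_V \dot T + (\Lambda_V - P)\dot V\,dt = \int_0^b R_1(T,V,\dot T,\dot V) - R_2(T,V,\dot T,\dot V)\,dt$.

This identity must hold for each parameterization. So fix $\varepsilon > 0$ and set

$(T_\varepsilon(t), V_\varepsilon(t)) = (T(\varepsilon t), V(\varepsilon t))$ ($0 \le t \le b/\varepsilon$).

Then

(50) $\int_0^{b/\varepsilon} C_V \dot T_\varepsilon + (\Lambda_V - P)\dot V_\varepsilon\,dt = \int_0^{b/\varepsilon} R_1(T_\varepsilon,V_\varepsilon,\dot T_\varepsilon,\dot V_\varepsilon) - R_2(T_\varepsilon,V_\varepsilon,\dot T_\varepsilon,\dot V_\varepsilon)\,dt$.

Since $\dot T_\varepsilon(t) = \varepsilon \dot T(\varepsilon t)$, $\dot V_\varepsilon(t) = \varepsilon \dot V(\varepsilon t)$, we can rewrite the left hand side of (50) as

(51) $\int_0^b C_V \dot T + (\Lambda_V - P)\dot V\,dt = \int_\Gamma C_V\,dT + (\Lambda_V - P)\,dV$,

where we use 1-form notation to emphasize that this term does not depend on the
parameterization. However (42) implies

$\left|\int_0^{b/\varepsilon} R_1(T_\varepsilon,V_\varepsilon,\dot T_\varepsilon,\dot V_\varepsilon) - R_2(T_\varepsilon,V_\varepsilon,\dot T_\varepsilon,\dot V_\varepsilon)\,dt\right| \le C \frac{b}{\varepsilon} \max_{0 \le t \le b/\varepsilon}\left[(\dot T_\varepsilon)^2 + (\dot V_\varepsilon)^2\right] \le C\varepsilon$.

Sending $\varepsilon \to 0$, we conclude

$\int_\Gamma C_V\,dT + (\Lambda_V - P)\,dV = 0$

for each cycle $\Gamma$ in $\Sigma$, and the existence of $E$ follows. □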
Remark. In fact (48) implies

(52) $R_1(T,V,A,B) = R_2(T,V,A,B)$

for all $(T,V) \in \Sigma$, $(A,B) \in \mathbb{R}^2$. To see this, consider any process $\Gamma$ parameterized by
$(T(t),V(t))$, where $a \le t \le 0$. Extend $\Gamma$ to a cyclic process $\Delta$ parameterized by $(T(t),V(t))$
for $a \le t \le 1$. Then by the proof above:

(53) $\int_a^1 R_1(T,V,\dot T,\dot V)\,dt = \int_a^1 R_2(T,V,\dot T,\dot V)\,dt$.

Reparameterize by writing

$T_\varepsilon(t) = T(t)$ for $a \le t \le 0$, $\quad T_\varepsilon(t) = T(\varepsilon t)$ for $0 \le t \le 1/\varepsilon$,

$V_\varepsilon(t) = V(t)$ for $a \le t \le 0$, $\quad V_\varepsilon(t) = V(\varepsilon t)$ for $0 \le t \le 1/\varepsilon$.

Formula (53) holds for any parameterization and so is valid with $(T_\varepsilon,V_\varepsilon,\dot T_\varepsilon,\dot V_\varepsilon)$ replacing
$(T,V,\dot T,\dot V)$. Making this change and letting $\varepsilon \to 0$, we deduce

$\int_a^0 R_1(T,V,\dot T,\dot V)\,dt = \int_a^0 R_2(T,V,\dot T,\dot V)\,dt$

for any process $(T(t),V(t))$, $a \le t \le 0$. This is only possible if $R_1 \equiv R_2$. □

The foregoing proofs demonstrate that our model with dissipation in some sense "approximates
an ideal model without dissipation" in the limit as we consider our processes
on "slower and slower time scales". The model without dissipation corresponds simply to
setting $R_1 \equiv 0$, $R_2 \equiv 0$.
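The "slower and slower time scales" mechanism can be made concrete with a toy remainder term. The sketch below assumes the hypothetical choice $R_1(T,V,A,B) = -k(A^2 + B^2)$, which satisfies a bound of the form (42), and a hypothetical cyclic process $T(t) = 350 + 50\cos t$, $V(t) = 1.5 + 0.5\sin t$ on $0 \le t \le 2\pi$; neither is taken from the text. It evaluates the dissipation integral for the rescaled process $(T(\varepsilon t), V(\varepsilon t))$, $0 \le t \le 2\pi/\varepsilon$.

```python
import math

K = 1.0e-3   # hypothetical friction coefficient

def remainder(A, B):
    # Toy dissipative remainder R_1 = -K (A^2 + B^2); it obeys a bound like (42).
    return -K * (A * A + B * B)

def T_dot(s):
    return -50.0 * math.sin(s)   # derivative of T(s) = 350 + 50 cos s

def V_dot(s):
    return 0.5 * math.cos(s)     # derivative of V(s) = 1.5 + 0.5 sin s

def dissipated_work(eps, steps=20000):
    """Integral of R_1 along the process slowed down by the factor eps,
    i.e. (T(eps t), V(eps t)) for 0 <= t <= 2 pi / eps (midpoint rule)."""
    b = 2.0 * math.pi
    h = (b / eps) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += remainder(eps * T_dot(eps * t), eps * V_dot(eps * t)) * h
    return total

for eps in (1.0, 0.1, 0.01):
    print(eps, dissipated_work(eps))   # scales linearly with eps
```

For a quadratic remainder the integral is exactly proportional to $\varepsilon$, matching the $C\varepsilon$ estimate in the proof of Theorem 4.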

To further develop our model we next assume as an instance of the Second Law that

$W > 0$

and

(54) $W = \left(1 - \frac{T_1}{T_2}\right) Q^+$

for any Carnot heat engine of the ideal model without dissipation acting between temperatures
$T_2 > T_1$. This assertion as before implies there exists a $C^2$ function $S : \Sigma \to \mathbb{R}$
with

(55) $\frac{\partial S}{\partial T} = \frac{C_V}{T}, \quad \frac{\partial S}{\partial V} = \frac{\Lambda_V}{T}$.

Finally it seems reasonable to assume

(56) $R_1(T,V,A,B) = R_2(T,V,A,B) \le 0$

for all $(T,V) \in \Sigma$, $(A,B) \in \mathbb{R}^2$. Indeed we can interpret for any process $\Gamma$ the term

$\int_a^b R_1(T,V,\dot T,\dot V)\,dt$

as representing the lost amount of work performed by the fluid along $\Gamma$ owing to velocity
dependent internal friction (= dissipation).

Combining (55), (56), we deduce for any cyclic process $\Gamma$ that

(57) $\int_a^b \frac{\mathcal{Q}(T,V,\dot T,\dot V)}{T}\,dt = \int_a^b \frac{d}{dt} S(T,V)\,dt + \int_a^b \frac{R_2(T,V,\dot T,\dot V)}{T}\,dt = \int_a^b \frac{R_2(T,V,\dot T,\dot V)}{T}\,dt \le 0$.

We employ earlier notation to rewrite:

(58) $\int_\Gamma \frac{đQ}{T} \le 0$, $\quad \Gamma$ a cyclic process.

This is a form of Clausius' inequality. If $\Gamma$ is a process from the state $\alpha = (T_0,V_0)$ to
$\beta = (T_1,V_1)$, we likewise deduce

(59) $\int_\Gamma \frac{đQ}{T} \le S(\beta) - S(\alpha)$.

Lastly, note for our model that if $\Gamma$ is a cyclic process with maximum temperature $T_2$
and minimum temperature $T_1$, its efficiency is

$\eta \le 1 - \frac{T_1}{T_2}$.

The proof is precisely like that in §3, except that we write

$q(t) = \mathcal{Q}(T(t),V(t),\dot T(t),\dot V(t))$

and start with the inequality

$0 \ge \int_\Gamma \frac{đQ}{T} = \int_a^b \frac{q}{T}\,dt$.

□

B. Some general theories 

In this section we discuss two representative modern theories for thermodynamics.

1. Entropy and efficiency

First we follow Day-Šilhavý [D-S] and introduce a new mathematical model for thermodynamic
systems.

a. Definitions 

Notation. $(Y_1, \dots, Y_n) = Y$ = typical point in $\mathbb{R}^n$.

A model for a thermodynamic system in many variables

We are given:

(a) an open, simply connected subset $\Sigma$ of $\mathbb{R}^n$ ($\Sigma$ is the state space and elements of $\Sigma$ are
called states)

and

(b) two $C^2$ functions

$T : \Sigma \to (0,\infty)$
$\Lambda : \Sigma \to \mathbb{R}^n$.

$T$ is the temperature and the components of $\Lambda$ are the generalized latent heats

$\Lambda = (\Lambda^1, \dots, \Lambda^n)$.

Notation.

(a) A path $\Gamma$ is an oriented, continuous, piecewise $C^1$ curve in $\Sigma$. A path is a cycle if its
starting and end points coincide.

(b) If $\Gamma$ is a path, its reversal $\hat\Gamma$ is the path taken with opposite orientation.

(c) If $\Gamma_1$ and $\Gamma_2$ are two paths such that the endpoint of $\Gamma_1$ is the starting point of $\Gamma_2$,
we write

$\Gamma_2 * \Gamma_1$

to denote the path consisting of $\Gamma_1$, followed by $\Gamma_2$.

Note. We parameterize $\Gamma$ by writing $\{Y(t) \mid a \le t \le b\}$,

$Y(t) = (Y_1(t), \dots, Y_n(t))$ = state at time $t$.

$\hat\Gamma$, $\Gamma_2 * \Gamma_1$ have the obvious parameterizations. □
Definitions. (i)

$Q(\Gamma) = \int_\Gamma đQ = \int_\Gamma \Lambda \cdot dY = \int_a^b \Lambda(Y(t)) \cdot \dot Y(t)\,dt$
= heat absorbed along $\Gamma$.

(ii)

$q(t) = \Lambda(Y(t)) \cdot \dot Y(t)$ = heating at time $t$

$q^+(t) = q(t)$ if $q(t) \ge 0$, and $q^+(t) = 0$ if $q(t) \le 0$

$q^-(t) = 0$ if $q(t) \ge 0$, and $q^-(t) = -q(t)$ if $q(t) \le 0$.

(iii) If $\Gamma$ is a path,

$Q^+(\Gamma) = \int_a^b q^+\,dt$ = heat gained along $\Gamma$

$Q^-(\Gamma) = \int_a^b q^-\,dt$ = heat lost along $\Gamma$.

If $\Gamma$ is a cycle,

$W(\Gamma) = Q(\Gamma) = Q^+(\Gamma) - Q^-(\Gamma)$ = work performed along $\Gamma$.

(iv)

$t^+(\Gamma) = \{t \in [a,b] \mid \dot Y(t) \text{ exists}, q(t) > 0\}$ = times at which heat is gained
$t^-(\Gamma) = \{t \in [a,b] \mid \dot Y(t) \text{ exists}, q(t) < 0\}$ = times at which heat is emitted.

(v)

$T^+(\Gamma) = \sup\{T(Y(t)) \mid t \in t^+(\Gamma) \cup t^-(\Gamma)\}$
= maximum temperature at which heat is absorbed or emitted

$T^-(\Gamma) = \inf\{T(Y(t)) \mid t \in t^+(\Gamma) \cup t^-(\Gamma)\}$
= minimum temperature at which heat is absorbed or emitted.

Remark.

$Q^+(\Gamma) = Q^-(\hat\Gamma)$

$Q^-(\Gamma) = Q^+(\hat\Gamma)$

$Q(\Gamma) = -Q(\hat\Gamma)$

$T^+(\Gamma) = T^+(\hat\Gamma)$

$T^-(\Gamma) = T^-(\hat\Gamma)$.

□

Terminology: A path $\Gamma$ is called (abbreviations are listed in brackets)

(a) adiabatic if $t^+(\Gamma), t^-(\Gamma) = \emptyset$. [$A$]

(b) absorbing and essentially isothermal if $t^-(\Gamma) = \emptyset$, $t^+(\Gamma) \neq \emptyset$ and
$T(Y(t))$ is constant on $t^+(\Gamma)$. [$M^+$]

(c) emitting and essentially isothermal if $t^+(\Gamma) = \emptyset$, $t^-(\Gamma) \neq \emptyset$ and
$T(Y(t))$ is constant on $t^-(\Gamma)$. [$M^-$]

(d) monotonic and essentially isothermal if $\Gamma$ is one of types (a), (b),
(c) above. [$M$]

(e) essentially isothermal if $T(Y(t))$ is constant on $t^+(\Gamma) \cup t^-(\Gamma)$. [$I$]

(f) a Carnot path if there exist $T_1 \le T_2$ such that [$C$]

$T(Y(t)) = T_1$ if $t \in t^-(\Gamma)$
$T(Y(t)) = T_2$ if $t \in t^+(\Gamma)$.

□



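Notation. If $\alpha, \beta \in \Sigma$,

$P(\alpha,\beta)$ = collection of all paths from $\alpha$ to $\beta$ in $\Sigma$
$A(\alpha,\beta)$ = collection of all adiabatic paths from $\alpha$ to $\beta$.

$M^{\pm}(\alpha,\beta)$, $M(\alpha,\beta)$, $I(\alpha,\beta)$, $C(\alpha,\beta)$ are similarly defined. □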

b. Existence of S 

Main Hypothesis. We assume that for each pair of states $\alpha, \beta \in \Sigma$ and for each temperature
level $\theta$ attained in $\Sigma$, there exists a monotone, essentially isothermal path $\Gamma \in M(\alpha,\beta)$
such that

(1) $T(Y(t)) = \theta$ on $t^+(\Gamma) \cup t^-(\Gamma)$.

Theorem. Assume

(2) $W(\Gamma) \le \left(1 - \frac{T^-(\Gamma)}{T^+(\Gamma)}\right) Q^+(\Gamma)$

for each cycle in $\Sigma$ and

(3) $W(\Gamma) = \left(1 - \frac{T^-(\Gamma)}{T^+(\Gamma)}\right) Q^+(\Gamma)$

for each Carnot cycle in $\Sigma$.

Then there exists a $C^2$ function

$S : \Sigma \to \mathbb{R}$

such that

(4) $\frac{\partial S}{\partial Y_k} = \frac{\Lambda^k}{T}$ in $\Sigma$ ($k = 1, 2, \dots, n$).

This result says that the efficiency estimates (2), (3) (special cases of which we have
earlier encountered) in fact imply the existence of the entropy $S$. Conversely it is easy to
check using the arguments in §A.3 that (4) implies (2), (3).



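Proof. 1. Define the crude entropy change

(5) $\xi(\Gamma) := 0$ if $\Gamma$ is adiabatic, and $\xi(\Gamma) := \frac{Q^+(\Gamma)}{T^+(\Gamma)} - \frac{Q^-(\Gamma)}{T^-(\Gamma)}$ if not.

Then (2), (3) say:

(6) $\xi(\Gamma) \le 0$ for each cycle $\Gamma$,
$\xi(\Gamma) = 0$ for each Carnot cycle $\Gamma$.

2. Claim # 1. For each $\alpha, \beta \in \Sigma$:

(7) $M(\alpha,\beta) = A(\alpha,\beta)$ or $M(\alpha,\beta) = M^+(\alpha,\beta)$
or $M(\alpha,\beta) = M^-(\alpha,\beta)$.

To prove this, fix $\alpha, \beta \in \Sigma$, $\Gamma_1, \Gamma_2 \in M(\alpha,\beta)$. Assume first $\Gamma_1 \in M^+(\alpha,\beta)$, but $\Gamma_2 \notin
M^+(\alpha,\beta)$. Then $\Gamma_2 \in A(\alpha,\beta) \cup M^-(\alpha,\beta)$, and so $Q^+(\Gamma_2) = Q^-(\Gamma_1) = 0$. Hence $Q^-(\hat\Gamma_2 *
\Gamma_1) = Q^-(\hat\Gamma_2) + Q^-(\Gamma_1) = Q^+(\Gamma_2) + Q^-(\Gamma_1) = 0$, but $\hat\Gamma_2 * \Gamma_1$ is not adiabatic. Thus

$\xi(\hat\Gamma_2 * \Gamma_1) = \frac{Q^+(\hat\Gamma_2 * \Gamma_1)}{T^+(\hat\Gamma_2 * \Gamma_1)} = \frac{Q^+(\hat\Gamma_2) + Q^+(\Gamma_1)}{T^+(\hat\Gamma_2 * \Gamma_1)} = \frac{Q^-(\Gamma_2) + Q^+(\Gamma_1)}{T^+(\hat\Gamma_2 * \Gamma_1)} > 0$.

Since $\hat\Gamma_2 * \Gamma_1$ is a cycle, we have a contradiction to (6). Hence $\Gamma_1 \in M^+(\alpha,\beta)$ implies
$\Gamma_2 \in M^+(\alpha,\beta)$. Likewise $\Gamma_1 \in M^-(\alpha,\beta)$ implies $\Gamma_2 \in M^-(\alpha,\beta)$. This proves (7).

3. Claim # 2. If $\Gamma_1, \Gamma_2 \in M(\alpha,\beta)$, then

(8) $\xi(\Gamma_1) = \xi(\Gamma_2)$.

According to Claim # 1, $\Gamma_1, \Gamma_2 \in A(\alpha,\beta)$ or $\Gamma_1, \Gamma_2 \in M^+(\alpha,\beta)$ or $\Gamma_1, \Gamma_2 \in M^-(\alpha,\beta)$.
The first possibility immediately gives (8). Suppose now $\Gamma_1, \Gamma_2 \in M^+(\alpha,\beta)$, with
$Q^+(\Gamma_1) \ge Q^+(\Gamma_2)$. Then $\Gamma = \hat\Gamma_2 * \Gamma_1$ is cyclic, with

$T^+(\Gamma) = T^+(\Gamma_1)$
$T^-(\Gamma) = T^+(\Gamma_2)$
$Q^+(\Gamma) = Q^+(\Gamma_1)$
$Q^-(\Gamma) = Q^+(\Gamma_2)$.

Thus (6) implies

$0 = \xi(\Gamma) = \frac{Q^+(\Gamma)}{T^+(\Gamma)} - \frac{Q^-(\Gamma)}{T^-(\Gamma)} = \frac{Q^+(\Gamma_1)}{T^+(\Gamma_1)} - \frac{Q^+(\Gamma_2)}{T^+(\Gamma_2)} = \xi(\Gamma_1) - \xi(\Gamma_2)$.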

This is (8) and a similar proof is valid if $\Gamma_1, \Gamma_2 \in M^-(\alpha,\beta)$.

4. Claim # 3. If $\Delta \in P(\alpha,\beta)$ and $\Gamma \in M(\alpha,\beta)$, then

(9) $\xi(\Delta) \le \xi(\Gamma)$.

To prove (9), recall the possibilities listed in (7). If $M(\alpha,\beta) = A(\alpha,\beta)$, then $\Gamma$ is adiabatic
and $\xi(\Gamma) = 0$. Also $\hat\Gamma * \Delta$ is a cycle, and $\xi(\hat\Gamma * \Delta) = \xi(\Delta)$. Thus (6) implies

$\xi(\Delta) = \xi(\hat\Gamma * \Delta) \le 0 = \xi(\Gamma)$.

This is (9).

If $M(\alpha,\beta) = M^+(\alpha,\beta)$, then $A(\alpha,\beta) = M^-(\alpha,\beta) = \emptyset$ and so $\Delta$ is not adiabatic. Thus
$T^-(\Delta)$ is defined. According to the Main Hypothesis, there exists $\Pi \in M(\alpha,\beta) = M^+(\alpha,\beta)$
such that

$T^+(\Pi) = T^-(\Delta)$.

Set $\Xi = \hat\Pi * \Delta$. Then $\Xi$ is a cycle, with

$T^+(\Xi) = T^+(\Delta)$
$T^-(\Xi) = T^+(\Pi) = T^-(\Delta)$
$Q^+(\Xi) = Q^+(\Delta)$
$Q^-(\Xi) = Q^-(\Delta) + Q^+(\Pi)$.

Thus (6) says

$0 \ge \xi(\Xi) = \frac{Q^+(\Xi)}{T^+(\Xi)} - \frac{Q^-(\Xi)}{T^-(\Xi)} = \frac{Q^+(\Delta)}{T^+(\Delta)} - \frac{Q^-(\Delta)}{T^-(\Delta)} - \frac{Q^+(\Pi)}{T^+(\Pi)} = \xi(\Delta) - \xi(\Pi)$.

Then

$\xi(\Delta) \le \xi(\Pi) = \xi(\Gamma)$,

where we employed Claim # 2 for the last equality. The proof of (9) if $M(\alpha,\beta) = M^-(\alpha,\beta)$
is similar.

5. Claim # 4. If $\Delta \in P(\alpha,\beta)$ and $\Gamma \in I(\alpha,\beta)$, then

(10) $\xi(\Delta) \le \xi(\Gamma)$.

To verify (10), select any $\Pi \in M(\alpha,\beta)$. Then $\hat\Pi \in M(\beta,\alpha)$, $\hat\Gamma \in P(\beta,\alpha)$, and so

$\xi(\hat\Gamma) \le \xi(\hat\Pi)$

according to Claim # 3. Since $\Pi, \Gamma \in I(\alpha,\beta)$,

$\xi(\hat\Gamma) = -\xi(\Gamma), \quad \xi(\hat\Pi) = -\xi(\Pi)$.

Thus

$\xi(\Pi) \le \xi(\Gamma)$.

Owing to Claim # 3, then,

$\xi(\Delta) \le \xi(\Pi) \le \xi(\Gamma)$.

This is (10).

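6. Claim # 5. There is a function $\phi : \Sigma \to \mathbb{R}$ such that

(11) $\xi(\Gamma) \le \phi(\beta) - \phi(\alpha)$

for all $\alpha, \beta \in \Sigma$, $\Gamma \in P(\alpha,\beta)$.

To prove this, note first that Claim # 4 implies

$\xi(\Delta_1) = \xi(\Delta_2)$ if $\Delta_1, \Delta_2 \in I(\alpha,\beta)$.

Thus we can define

$\pi(\alpha,\beta) := \xi(\Delta)$ ($\Delta \in I(\alpha,\beta)$).

Then according to Claim # 4

$\xi(\Gamma) \le \pi(\alpha,\beta)$ ($\Gamma \in P(\alpha,\beta)$),

and so to derive (11) we must show that we can write

(12) $\pi(\alpha,\beta) = \phi(\beta) - \phi(\alpha)$ for all $\alpha, \beta \in \Sigma$.

For this fix a state $\gamma \in \Sigma$ and a temperature level $\theta$. Owing to the Main Hypothesis, there
exist $\Gamma_1 \in M(\alpha,\beta)$, $\Gamma_2 \in M(\beta,\gamma)$ such that

$T^{\pm}(\Gamma_1) = \theta$ on $t^+(\Gamma_1) \cup t^-(\Gamma_1)$
$T^{\pm}(\Gamma_2) = \theta$ on $t^+(\Gamma_2) \cup t^-(\Gamma_2)$.

Then $\Gamma_2 * \Gamma_1 \in I(\alpha,\gamma)$ and

$\xi(\Gamma_2 * \Gamma_1) = \frac{Q^+(\Gamma_2 * \Gamma_1)}{T^+(\Gamma_2 * \Gamma_1)} - \frac{Q^-(\Gamma_2 * \Gamma_1)}{T^-(\Gamma_2 * \Gamma_1)} = \frac{Q^+(\Gamma_2) + Q^+(\Gamma_1)}{\theta} - \frac{Q^-(\Gamma_2) + Q^-(\Gamma_1)}{\theta} = \xi(\Gamma_2) + \xi(\Gamma_1)$.

Hence

(13) $\pi(\alpha,\gamma) = \pi(\alpha,\beta) + \pi(\beta,\gamma)$.

Define

$\phi(\alpha) := -\pi(\alpha,\gamma)$,

to deduce (12) from (13).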

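7. Finally we build the entropy function $S$. Take any cycle $\Gamma$ in $\Sigma$, $\Gamma$ parameterized by

$\{Y(t) \mid 0 \le t \le 1\}$.

Fix $\varepsilon > 0$. Then take $N$ so large that

$\left|\frac{1}{T(Y(t_1))} - \frac{1}{T(Y(t_2))}\right| \le \varepsilon$

if $t_1, t_2 \in \left[\frac{k-1}{N}, \frac{k}{N}\right]$ ($k = 1, \dots, N$). Thus we have

(14) $\left|\frac{1}{T^-(\Gamma_k)} - \frac{1}{T^+(\Gamma_k)}\right| \le \varepsilon$ ($k = 1, \dots, N$),

where $\Gamma_k$ is parameterized by $\{Y(t) \mid \frac{k-1}{N} \le t \le \frac{k}{N}\}$. We here and afterwards assume each
$\Gamma_k$ is not adiabatic, as the relevant estimates are trivial for any adiabatic $\Gamma_k$. Thus

$\int_\Gamma \frac{đQ}{T} := \int_0^1 \frac{1}{T(Y(t))} \Lambda(Y(t)) \cdot \dot Y(t)\,dt$ (the integrand being $q(t)/T(Y(t))$)

$\le \sum_{k=1}^N \frac{Q^+(\Gamma_k)}{T^-(\Gamma_k)} - \frac{Q^-(\Gamma_k)}{T^+(\Gamma_k)}$

$\le \sum_{k=1}^N Q^+(\Gamma_k)\left(\frac{1}{T^+(\Gamma_k)} + \varepsilon\right) + \left(-\frac{1}{T^-(\Gamma_k)} + \varepsilon\right) Q^-(\Gamma_k)$

$= \sum_{k=1}^N \xi(\Gamma_k) + \varepsilon\left(Q^+(\Gamma_k) + Q^-(\Gamma_k)\right)$

$= \sum_{k=1}^N \xi(\Gamma_k) + \varepsilon \int_0^1 |q(t)|\,dt$

$\le \sum_{k=1}^N \xi(\Gamma_k) + C\varepsilon$.

Now

$\xi(\Gamma_k) \le \phi(\beta_k) - \phi(\alpha_k)$

by Claim # 5, with

$\alpha_k = Y\left(\frac{k-1}{N}\right)$
$\beta_k = Y\left(\frac{k}{N}\right) = \alpha_{k+1}$.

Since $\Gamma$ is a cycle, $\alpha_{N+1} = \alpha_1$, and so

$\sum_{k=1}^N \xi(\Gamma_k) \le 0$.

Consequently the calculation above forces:

$\int_0^1 \frac{\Lambda(Y(t))}{T(Y(t))} \cdot \dot Y(t)\,dt \le C\varepsilon$,

and thus

$\int_\Gamma \frac{đQ}{T} = \int_0^1 \frac{q(t)}{T(Y(t))}\,dt \le 0$.

Applying the same reasoning to $\hat\Gamma$ in place of $\Gamma$, we conclude

$\int_\Gamma \frac{đQ}{T} = 0$ for each cycle $\Gamma$.

As $\Sigma$ is simply connected, this fact implies the existence of $S : \Sigma \to \mathbb{R}$ with

$DS = \frac{\Lambda}{T}$,

which is (4). □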

2. Entropy, temperature and supporting hyperplanes 

Modern rigorous approaches to thermodynamics vastly extend the realm of applicability
of the foregoing notions to extremely diverse systems of various sorts. See for instance Serrin
[S2], [S3]. As an interesting second illustration of a modern theory we next present Feinberg
and Lavine's derivation [F-L1] of entropy and temperature for an abstract thermal system,
as a consequence of the Hahn-Banach Theorem.

a. Definitions 

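Notation.

(i) $\Sigma$ = a compact metric space.

(ii) $C(\Sigma)$ = space of continuous functions $\phi : \Sigma \to \mathbb{R}$, with

$\|\phi\|_{C(\Sigma)} = \max_{\alpha \in \Sigma} |\phi(\alpha)|$.

(iii) $\mathcal{M}(\Sigma)$ = space of signed Radon measures on $\Sigma$.
$\mathcal{M}_+(\Sigma)$ = space of nonnegative Radon measures.
$\mathcal{M}_0(\Sigma) = \{\nu \in \mathcal{M}(\Sigma) \mid \nu(\Sigma) = 0\}$.

(iv) We endow $\mathcal{M}(\Sigma)$ with the weak* topology, which is the weakest topology for which
the mappings

$\mu \mapsto \int_\Sigma \phi\,d\mu$

are continuous on $\mathcal{M}(\Sigma)$ for all $\phi \in C(\Sigma)$.

(v) $\mathcal{M}_0(\Sigma) \times \mathcal{M}(\Sigma) = \{(\nu,\mu) \mid \nu \in \mathcal{M}_0(\Sigma), \mu \in \mathcal{M}(\Sigma)\}$.

We give $\mathcal{M}(\Sigma)$ the weak* topology, $\mathcal{M}_0(\Sigma)$ the inherited subspace topology and $\mathcal{M}_0(\Sigma) \times
\mathcal{M}(\Sigma)$ the product topology. □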

A model for an abstract thermodynamic system 

We are given

(a) a compact metric space $\Sigma$, as above. ($\Sigma$ is the state space and elements of $\Sigma$ are
called states).

and

(b) a nonempty set $\mathcal{P} \subset \mathcal{M}_0(\Sigma) \times \mathcal{M}(\Sigma)$, such that

(15) $\mathcal{P}$ is a closed, convex cone.

(Elements of $\mathcal{P}$ are called processes. If $\Gamma = (\nu,\mu) \in \mathcal{P}$ is a process, then

$\nu \in \mathcal{M}_0(\Sigma)$ is the change of condition

and

$\mu \in \mathcal{M}(\Sigma)$ is the heating measure

for $\Gamma$.)

Definitions. (i) A cyclic process for our system is a process whose change of condition is
the zero measure.

(ii) Let $\mathcal{C}$ denote the set of measures which are heating measures for cyclic processes.
That is,

(16) $\mathcal{C} = \{\mu \in \mathcal{M}(\Sigma) \mid (0,\mu) \in \mathcal{P}\}$.

Physical interpretation. The abstract apparatus above is meant to model the following
situation.

(a) Suppose we have a physical body $U$ made of various material points. At any given
moment of time, we assign to each point $x \in U$ a state $\sigma(x) \in \Sigma$.

[Figure: the map $\sigma$ from the physical body $U$ into the abstract state space $\Sigma$.]

The condition of the body at this fixed time is then defined to be the measure $\rho \in \mathcal{M}_+(\Sigma)$
such that

$\rho(E)$ = mass of $\sigma^{-1}(E)$

for each Borel subset $E \subset \Sigma$.

(b) We now imagine a process which somehow acts on $U$, thereby changing the initial
condition $\rho_i$ to a final condition $\rho_f$. The change of condition is

$\nu = \rho_f - \rho_i$.

Observe

$\nu(\Sigma) = \rho_f(\Sigma) - \rho_i(\Sigma)$
= mass of $U$ − mass of $U$ = 0.

Thus $\nu \in \mathcal{M}_0(\Sigma)$. If the process is cyclic, then $\rho_f = \rho_i$, that is, $\nu = 0$.

(c) We record also the heat received during the process by defining

$\mu(E)$ = net amount of heat received from the
exterior during the entire process, by the
material points with initial states lying in $E$

for each Borel subset $E \subset \Sigma$. The signed measure $\mu$ is the heating measure. □

b. Second Law 

We next assume that our abstract system satisfies a form of the Second Law, which for
us means

(17) $\mathcal{C} \cap \mathcal{M}_+(\Sigma) = \{0\}$.

Physical interpretation. Condition (17) is a mathematical interpretation of the Kelvin-Planck
statement of the Second Law, namely

(18) "if, while suffering a cyclic process, a body absorbs
heat from its exterior, that body must also emit heat
to its exterior during the process".

This is quoted from [F-L1, p. 223]. In other words, the heat supplied to the body undergoing
the cyclic process cannot be converted entirely into work: there must be emission of heat
from the body as well. The cyclic process cannot operate with perfect efficiency.

Our condition (17) is a mathematical interpretation of all this: the heating measure $\mu$
for a cyclic process cannot be a nonnegative measure.
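A finite toy version may help fix ideas about condition (17). In the sketch below the state space has three points, a measure is a list of three numbers, and the cone of cyclic heating measures is taken to be $\{\mu : \sum_i \mu_i / T_i \le 0\}$ for some positive temperatures $T_i$. This cone, the temperatures, and the function names are all assumptions made for illustration; the text itself only postulates (15) and (17).

```python
temperatures = [400.0, 300.0, 350.0]   # hypothetical T on a 3-point state space

def is_cyclic_heating(mu):
    """Membership in the toy cone C = {mu : sum_i mu_i / T_i <= 0}."""
    return sum(m / T for m, T in zip(mu, temperatures)) <= 0.0

def is_nonnegative(mu):
    return all(m >= 0.0 for m in mu)

def violates_second_law(mu):
    """True if mu is a nonzero, nonnegative measure lying in C,
    which condition (17) forbids."""
    nonzero = any(m != 0.0 for m in mu)
    return is_cyclic_heating(mu) and is_nonnegative(mu) and nonzero

# Absorb 5 units at T = 400 and emit 4 units at T = 300: allowed in the toy cone.
print(is_cyclic_heating([5.0, -4.0, 0.0]))
# Absorb 5 units and emit only 3: excluded from the toy cone.
print(is_cyclic_heating([5.0, -3.0, 0.0]))
# A measure that only absorbs heat is never a cyclic heating measure here.
print(violates_second_law([1.0, 0.0, 2.0]))
```

Because every $T_i$ is positive, a nonzero nonnegative $\mu$ always has $\sum_i \mu_i / T_i > 0$, so this toy cone meets the nonnegative measures only at $0$.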

c. The Hahn-Banach Theorem

We will need a form of the

Hahn-Banach Theorem. Let $X$ be a Hausdorff, locally convex topological vector space.
Suppose $K_1$, $K_2$ are two nonempty, disjoint, closed convex subsets of $X$ and that $K_1$ is a
cone.

Then there exists a continuous linear function

(19) $\Phi : X \to \mathbb{R}$

such that

(20) $\Phi(k_1) \le 0$ for all $k_1 \in K_1$
$\Phi(k_2) > 0$ for all $k_2 \in K_2$.

We can think of the set $\{\Phi = \gamma\}$ as a separating hyperplane between $K_1$, $K_2$.

[Figure: the cone $K_1$ and the set $K_2$ separated by a hyperplane, with $K_2$ lying in the half-space $\{\Phi > 0\}$.]



To utilize the Hahn-Banach Theorem, we will need an explicit characterization of $\Phi$ in
certain circumstances:

Lemma (i) Let $X = \mathcal{M}(\Sigma)$ and suppose

(21) $\Phi : X \to \mathbb{R}$

is continuous and linear. Then there exists $\phi \in C(\Sigma)$ such that

(22) $\Phi(\mu) = \int_\Sigma \phi\,d\mu$ for all $\mu \in X$.

(ii) Let $X = \mathcal{M}_0(\Sigma) \times \mathcal{M}(\Sigma)$ and suppose

$\Phi : X \to \mathbb{R}$

is continuous and linear. Then there exist $\phi, \psi \in C(\Sigma)$ such that

(23) $\Phi((\nu,\mu)) = \int_\Sigma \psi\,d\nu + \int_\Sigma \phi\,d\mu$

for all $(\nu,\mu) \in X$.

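Proof. 1. Suppose $X = \mathcal{M}(\Sigma)$ and

$\Phi : X \to \mathbb{R}$

is continuous, linear. Fix any point $\alpha \in \Sigma$ and let $\delta_\alpha$ denote the point mass at $\alpha$. Next
define

$\phi(\alpha) := \Phi(\delta_\alpha)$ ($\alpha \in \Sigma$).

2. We first claim

$\phi \in C(\Sigma)$.

To prove this, let $\alpha_k \to \alpha$ in $\Sigma$. Then for every $\psi \in C(\Sigma)$,

$\int_\Sigma \psi\,d\delta_{\alpha_k} = \psi(\alpha_k) \to \psi(\alpha) = \int_\Sigma \psi\,d\delta_\alpha$

and so

$\delta_{\alpha_k} \rightharpoonup \delta_\alpha$ weakly as measures.

This means

$\delta_{\alpha_k} \to \delta_\alpha$ in $\mathcal{M}(\Sigma)$.

As $\Phi$ is continuous,

$\phi(\alpha_k) = \Phi(\delta_{\alpha_k}) \to \Phi(\delta_\alpha) = \phi(\alpha)$.

Thus $\phi$ is continuous.

3. Since $\Phi$ is linear,

(24) $\Phi\left(\sum_{k=1}^m a_k \delta_{\alpha_k}\right) = \sum_{k=1}^m a_k \phi(\alpha_k)$

for all $\{\alpha_k\}_{k=1}^m \subset \Sigma$, $\{a_k\}_{k=1}^m \subset \mathbb{R}$. Finally take any measure $\mu \in \mathcal{M}(\Sigma)$. We can find
measures $\{\mu_m\}_{m=1}^\infty$ of the form

$\mu_m = \sum_{k=1}^m a_k^m \delta_{\alpha_k^m}$ ($m = 1, \dots$)

such that

(25) $\mu_m \to \mu$ in $\mathcal{M}(\Sigma)$.

Then (24) implies

$\sum_{k=1}^m a_k^m \phi(\alpha_k^m) = \Phi(\mu_m) \to \Phi(\mu)$ as $m \to \infty$.

But since $\phi$ is continuous, (25) says

$\int_\Sigma \phi\,d\mu_m = \sum_{k=1}^m a_k^m \phi(\alpha_k^m) \to \int_\Sigma \phi\,d\mu$ as $m \to \infty$.

Thus

$\Phi(\mu) = \int_\Sigma \phi\,d\mu$.

This proves the representation formula (22) and the proof of (23) is then immediate. □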
d. Existence of S,T 

Theorem Assume that our abstract thermodynamic system satisfies (17). Then there exist
continuous functions

        $S : \Sigma \to \mathbb{R}$
        $T : \Sigma \to (0,\infty)$

such that

(26)    $\int_\Sigma \frac{d\mu}{T} \le \int_\Sigma S \, d\nu$

for each process $\Gamma = (\nu, \mu) \in \mathcal{P}$.

We will later interpret (26) as a form of Clausius' inequality.
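Proof. 1. Hereafter set $X = \mathcal{M}_0(\Sigma) \times \mathcal{M}(\Sigma)$,

        $K_1 = \mathcal{P}$,  $K_2 = \{0\} \times \mathcal{M}_+^1(\Sigma)$,

where

        $\mathcal{M}_+^1(\Sigma) = \{\rho \in \mathcal{M}_+(\Sigma) \mid \rho(\Sigma) = 1\}$.

By hypothesis (15), $K_1$ is a closed, convex cone in $X$. $K_2$ is also closed, convex in $X$. In
addition our form of the Second Law (17) implies that

        $K_1 \cap K_2 = \emptyset$.

We may therefore invoke the Hahn-Banach Theorem, in the formulation (20), (23): there
exist $\phi, \psi \in C(\Sigma)$ with

(27)    $\Phi(\nu, \mu) = \int_\Sigma \phi \, d\nu + \int_\Sigma \psi \, d\mu \le 0$ for all $\Gamma = (\nu, \mu) \in \mathcal{P}$

and

(28)    $\Phi(0, \rho) = \int_\Sigma \psi \, d\rho > 0$ for all $\rho \in \mathcal{M}_+^1(\Sigma)$.

Taking $\rho = \delta_\alpha$ in (28), where $\alpha \in \Sigma$, we deduce that

        $\psi > 0$ on $\Sigma$.

Change notation by writing

        $S := -\phi$,  $T := \frac{1}{\psi}$.

Then (27) reads

        $\int_\Sigma \frac{d\mu}{T} \le \int_\Sigma S \, d\nu$

for all processes $\Gamma = (\nu, \mu) \in \mathcal{P}$. □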

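Physical interpretation. We can repackage inequality (26) into more familiar form by
first of all defining

(29)    $\int_\Gamma \frac{dQ}{T} := \int_\Sigma \frac{d\mu}{T}$

for each process $\Gamma = (\nu, \mu)$. Then (26) says

(30)    $\int_\Gamma \frac{dQ}{T} \le 0$    $\Gamma$ a cyclic process.

Fix now any two states $\alpha, \beta \in \Sigma$. Then if there exists a process $\Gamma = (\delta_\beta - \delta_\alpha, \mu) \in \mathcal{P}$, (26)
and (29) read

(31)    $\int_\Gamma \frac{dQ}{T} \le S(\beta) - S(\alpha)$    $\Gamma$ a process from $\alpha$ to $\beta$.

Clearly (30) is a kind of Clausius inequality for our abstract thermodynamic system.

Finally let us say a process $\Gamma = (\delta_\beta - \delta_\alpha, \mu) \in \mathcal{P}$ is reversible if $\hat\Gamma := (\delta_\alpha - \delta_\beta, -\mu) \in \mathcal{P}$.
Then

        $\int_\Gamma \frac{dQ}{T} = S(\beta) - S(\alpha)$    $\Gamma$ a reversible process from $\alpha$ to $\beta$.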

Remark. See Serrin [S2], [S3], Coleman-Owen-Serrin [C-O-S], Owen [O] for another general 
approach based upon a different, precise mathematical interpretation of the Second Law. 
Note also the very interesting forthcoming book by Man and Serrin [M-S]. □ 
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CHAPTER 3: Continuum thermodynamics 



Since our primary subject in these notes is the application of entropy ideas to PDE theory,
we must next confront a basic issue: the classical physical theory from Chapter I and from
§A, B.1 in Chapter II is concerned with entropy defined as an extensive parameter over an
entire system. That is, $S$ does not have spatial dependence. The rather abstract framework
in §B.2 of Chapter II does on the other hand allow for variations of $S$ over a material body,
and this prospect points the way to other modern approaches to continuum thermodynamics,
for which the entropy, internal energy, temperature, etc. are taken to be functions of both
position and time. We will present here the axiomatic method of Coleman-Noll [C-N].

A. Kinematics 

1. Deformations 

We model the motion of a material body by introducing first a smooth bounded region
$U$, the reference configuration, a typical point of which we denote by $X$. We suppose the
moving body occupies the region $U(t)$ at time $t \ge 0$, where for simplicity we take $U(0) = U$.

Let us describe the motion by introducing a smooth mapping $\chi : U \times [0, \infty) \to \mathbb{R}^3$ so that

(1)    $x = \chi(X, t)$

is the location at time $t \ge 0$ of the material particle initially at $X \in U$. We require that
for each $t \ge 0$, $\chi(\cdot, t) : U \to U(t)$ is an orientation preserving diffeomorphism. Write
$\psi(\cdot, t) = \chi^{-1}(\cdot, t)$; so that

(2)    $X = \psi(x, t)$.

Then

(3)    $\mathbf{v}(x,t) = \frac{\partial \chi}{\partial t}(X,t)$

is the velocity field, where $X$, $x$ are associated by (1), (2).

2. Physical quantities 

We introduce as well:

(i)    the mass density                  $\rho(x,t)$
(ii)   the stress tensor                 $T(x,t)$
(iii)  the body force/unit mass          $b(x,t)$
(iv)   the internal energy/unit mass     $e(x,t)$
(v)    the heat flux vector              $q(x,t)$
(vi)   the heat supply/unit mass         $r(x,t)$
(vii)  the entropy/unit mass             $s(x,t)$
(viii) the local temperature             $\theta(x,t)$.

The functions $\rho$, $\theta$ are assumed to be positive.

Remarks. (1) We henceforth assume these are all smooth functions. Note

        $\rho$, $e$, $r$, $s$ and $\theta$ are real valued
        $b$, $q$ take values in $\mathbb{R}^3$
        $T$ takes values in $\mathbb{S}^3$ (= space of $3 \times 3$ symmetric matrices)

(2) We will sometimes think of these quantities as being functions of the material point
$X$ rather than the position $x$. In this case we will write

        $\rho = \rho(X,t)$, $T = T(X,t)$, etc.

where $X$, $x$ are related by (1), (2). □
Notation. We will often write

(4)    $dm = \rho \, dx$.

Note. The kinetic energy at time $t \ge 0$ is

        $K(t) = \int_{U(t)} \frac{|\mathbf{v}|^2}{2} \, dm$;

the internal energy is

        $E(t) = \int_{U(t)} e \, dm$;

and the entropy is

        $S(t) = \int_{U(t)} s \, dm$.

□

3. Kinematic formulas 

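Take $V$ to be any smooth subregion of $U$ and write $V(t) = \chi(V,t)$ $(t \ge 0)$.
If $f = f(x,t)$ $(x \in \mathbb{R}^3, t \ge 0)$ is smooth, we compute:

        $\frac{d}{dt}\left(\int_{V(t)} f \, dx\right) = \int_{V(t)} \frac{\partial f}{\partial t} \, dx + \int_{\partial V(t)} f \, \mathbf{v} \cdot \nu \, dS$,

where "$dS$" denotes 2-dimensional surface measure on $\partial V(t)$, $\mathbf{v}$ is the velocity field and $\nu$ is
the unit outer normal vector field to $\partial V(t)$. Applying the Gauss-Green Theorem we deduce:

(5)    $\frac{d}{dt}\left(\int_{V(t)} f \, dx\right) = \int_{V(t)} \frac{\partial f}{\partial t} + \mathrm{div}(f \mathbf{v}) \, dx$.

Take $f = \rho$ above. Then

        $\int_{V(t)} \frac{\partial \rho}{\partial t} + \mathrm{div}(\rho \mathbf{v}) \, dx = \frac{d}{dt}\left(\int_{V(t)} \rho \, dx\right) = 0$,

as the total mass within the regions $V(t)$ moving with the deformation is unchanging in
time. Since the region $V(t)$ is arbitrary, we deduce

(6)    $\frac{\partial \rho}{\partial t} + \mathrm{div}(\rho \mathbf{v}) = 0$    conservation of mass.

This PDE holds in $U(t)$ at each time $t \ge 0$. Now again take $f = f(x,t)$ and compute

        $\frac{d}{dt}\left(\int_{V(t)} f \, dm\right) = \frac{d}{dt}\left(\int_{V(t)} f \rho \, dx\right)$    by (4)

        $= \int_{V(t)} \frac{\partial (f\rho)}{\partial t} + \mathrm{div}(f \rho \mathbf{v}) \, dx$    by (5)

        $= \int_{V(t)} \rho \frac{\partial f}{\partial t} + \rho \mathbf{v} \cdot Df \, dx$    by (6),

where

        $Df = D_x f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \frac{\partial f}{\partial x_3}\right)$ = gradient of $f$ with respect to the spatial variables $x = (x_1, x_2, x_3)$.

Recalling (4) we deduce

(7)    $\frac{d}{dt}\left(\int_{V(t)} f \, dm\right) = \int_{V(t)} \frac{Df}{Dt} \, dm$,

where

        $\frac{Df}{Dt} = \frac{\partial f}{\partial t} + \mathbf{v} \cdot Df$

is the material derivative. Observe that

        $\frac{d}{dt} f(\chi(X,t),t) = \frac{Df}{Dt}$.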



4. Deformation gradient 

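We define as well the deformation gradient

(8)    $F(X,t) = D_X \chi(X,t) = \begin{pmatrix} \frac{\partial \chi^1}{\partial X_1} & \cdots & \frac{\partial \chi^1}{\partial X_3} \\ \vdots & & \vdots \\ \frac{\partial \chi^3}{\partial X_1} & \cdots & \frac{\partial \chi^3}{\partial X_3} \end{pmatrix}$

and the velocity gradient

(9)    $L(X,t) = \frac{\partial F}{\partial t}(X,t) F^{-1}(X,t)$

for $X \in U$, $t \ge 0$.

To understand the name of $L$, recall $\mathbf{v}(x,t) = \frac{\partial \chi}{\partial t}(X,t) = \frac{\partial \chi}{\partial t}(\psi(x,t),t)$. Thus

        $v^i_{x_j} = \sum_{k=1}^3 \frac{\partial^2 \chi^i}{\partial t \, \partial X_k} \frac{\partial \psi^k}{\partial x_j}$    $(1 \le i, j \le 3)$.

As $F^{-1} = D_x \psi$, we see that

        $D\mathbf{v}(x,t) = \frac{\partial F}{\partial t}(X,t) F^{-1}(X,t)$.

Thus

(10)    $L(X,t) = D\mathbf{v}(x,t)$

is indeed the gradient of the velocity.

Remark. Let $\rho(\cdot, 0) = \rho_0$ denote the mass density at time $t = 0$. Then

        $\rho(x,t) = (\det D_x \psi(x,t)) \rho_0(X)$

is the density at $x \in U(t)$, $t \ge 0$. Since $F = (D_x \psi)^{-1}$, we see that

(11)    $\rho(x,t) = (\det F(X,t))^{-1} \rho_0(X)$

for $t \ge 0$, $X$, $x$ associated by (1), (2).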
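As a quick sanity check of (9) and (10), the following sketch compares $L = \frac{\partial F}{\partial t} F^{-1}$ with a finite-difference approximation of $D\mathbf{v}$. The motion used here, a simple shear $\chi(X,t) = (X_1 + tX_2, X_2, X_3)$, is an illustrative assumption and does not appear in the notes; all function names are hypothetical.

```python
# Simple shear chi(X, t) = (X1 + t*X2, X2, X3): an assumed example motion.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def deformation_gradient(t):
    # F = D_X chi for the shear motion
    return [[1.0, t, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def deformation_gradient_inverse(t):
    # F^{-1} = D_x psi, with psi(x, t) = (x1 - t*x2, x2, x3)
    return [[1.0, -t, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def dF_dt(t, h=1e-6):
    # central difference in time of the deformation gradient
    Fp, Fm = deformation_gradient(t + h), deformation_gradient(t - h)
    return [[(Fp[i][j] - Fm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]

def velocity_gradient_L(t):
    # formula (9): L = (dF/dt) F^{-1}
    return matmul(dF_dt(t), deformation_gradient_inverse(t))

def spatial_velocity(x, t):
    # v(x, t) = d chi/dt (X, t) = (X2, 0, 0) = (x2, 0, 0), since X2 = x2
    return [x[1], 0.0, 0.0]

def spatial_velocity_gradient(x, t, h=1e-6):
    # Dv by central differences: entry (i, j) = d v^i / d x_j
    Dv = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        vp, vm = spatial_velocity(xp, t), spatial_velocity(xm, t)
        for i in range(3):
            Dv[i][j] = (vp[i] - vm[i]) / (2 * h)
    return Dv
```

For this motion $\det F = 1$, so by (11) the density is simply transported: $\rho(x,t) = \rho_0(X)$.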

B. Conservation laws; Clausius Duhem inequality 

We now further assume for each moving subregion $V(t)$ $(t \ge 0)$ as above that we have:

I. Balance of linear momentum

(1)    $\frac{d}{dt}\left(\int_{V(t)} \mathbf{v} \, dm\right) = \int_{V(t)} b \, dm + \int_{\partial V(t)} T\nu \, dS$.

This says the rate of change of linear momentum within the moving region $V(t)$ equals the
body force acting on $V(t)$ plus the contact force acting on $\partial V(t)$. We employ (6), (7) in §A
and the Gauss-Green Theorem to deduce from (1) that

(2)    $\rho \frac{D\mathbf{v}}{Dt} = \rho b + \mathrm{div}\, T$    balance of momentum.

We additionally suppose:

II. Energy balance

(3)    $\frac{d}{dt}\left(\int_{V(t)} \frac{|\mathbf{v}|^2}{2} + e \, dm\right) = \int_{V(t)} \mathbf{v} \cdot b + r \, dm + \int_{\partial V(t)} \mathbf{v} \cdot T\nu - q \cdot \nu \, dS$.

This identity asserts that the rate of change of the total energy (= kinetic energy + internal
energy (including potential energy)) within the moving region $V(t)$ equals the rate of work
performed by the body and contact forces, plus the body heat supplied minus the heat flux
outward through $\partial V(t)$.

Since $T$ is symmetric, $\mathbf{v} \cdot T\nu = (T\mathbf{v}) \cdot \nu$. Hence (3) and the Gauss-Green Theorem imply:

        $\rho \frac{D}{Dt}\left(\frac{|\mathbf{v}|^2}{2} + e\right) = \rho(\mathbf{v} \cdot b + r) + \mathrm{div}(T\mathbf{v} - q)$.

Simplify, using (2) to deduce:

(4)    $\rho \frac{De}{Dt} = \rho r - \mathrm{div}\, q + T : D\mathbf{v}$    energy balance.

Notation. If $A$, $B$ are $3 \times 3$ matrices, we write

        $A : B = \sum_{i,j=1}^3 a_{ij} b_{ij}$.

□



Lastly we hypothesize the

III. Clausius-Duhem inequality

(5)    $\frac{d}{dt}\left(\int_{V(t)} s \, dm\right) \ge \int_{V(t)} \frac{r}{\theta} \, dm - \int_{\partial V(t)} \frac{q \cdot \nu}{\theta} \, dS$.

This asserts that the rate of entropy increase within the moving region $V(t)$ is greater than
or equal to the heat supply divided by temperature integrated over the body plus an entropy
flux term integrated over $\partial V(t)$. Observe that (5) is a kind of continuum version of the
various forms of Clausius' inequality we have earlier encountered.

As before (5) implies

(6)    $\rho \frac{Ds}{Dt} \ge \frac{r\rho}{\theta} - \mathrm{div}\left(\frac{q}{\theta}\right)$    entropy inequality.

Notation. We define the local production of entropy per unit mass to be:

(7)    $\gamma := \frac{Ds}{Dt} - \frac{r}{\theta} + \frac{1}{\rho} \mathrm{div}\left(\frac{q}{\theta}\right) = \frac{Ds}{Dt} - \frac{r}{\theta} + \frac{\mathrm{div}(q)}{\rho\theta} - \frac{q \cdot D\theta}{\rho\theta^2} \ge 0$.

□

We call a motion $\chi$ and a collection of functions $\rho$, $T$, $b$, etc. satisfying I-III above an
admissible thermodynamic process.

C. Constitutive relations 

A particular material is defined by adding to the foregoing additional constitutive re- 
lations, which are restrictions on the functions T, b, etc. describing the thermodynamic 
process. 

1. Fluids 

Notation. We refer to

(1)    $v = \frac{1}{\rho}$

as the specific volume. Note that then

        $\int_{U(t)} v \, dm = |U(t)|$ = volume of $U(t)$.

□



a. We call our body a perfect fluid with heat conduction if there exist four functions
$\hat e$, $\hat\theta$, $\hat T$, $\hat q$ such that

(2)    (a) $e = \hat e(s, v)$
       (b) $\theta = \hat\theta(s, v)$
       (c) $T = \hat T(s, v)$
       (d) $q = \hat q(s, v, D\theta)$.

These are the constitutive relations.

Notation. Formula (a) means

        $e(x,t) = \hat e(s(x,t), v(x,t))$    $(x \in U(t), t \ge 0)$

where

        $\hat e : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$.

Equations (b)-(d) have similar interpretations. Below we will sometimes omit writing the circumflex
and so regard $e$ as a function of $(s, v)$, etc. □

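The key question is this:

        What restrictions on the constitutive relations
        (2) are imposed by the Clausius-Duhem
        inequality?

To deduce useful information, let us first combine (4), (6) from §B:

(3)    $0 \le \rho\theta\gamma = \rho\left(\theta \frac{Ds}{Dt} - \frac{De}{Dt}\right) + T : D\mathbf{v} - \frac{1}{\theta} q \cdot D\theta$.

Owing to (2)(a) we have:

(4)    $\frac{De}{Dt} = \frac{\partial \hat e}{\partial s} \frac{Ds}{Dt} + \frac{\partial \hat e}{\partial v} \frac{Dv}{Dt}$.

Now the conservation of mass (6) in §A implies

        $\frac{D\rho}{Dt} = -\rho \, \mathrm{div}\, \mathbf{v}$.

Thus

(5)    $\frac{Dv}{Dt} = -\frac{1}{\rho^2} \frac{D\rho}{Dt} = \frac{1}{\rho} \mathrm{div}\, \mathbf{v} = v \, \mathrm{div}\, \mathbf{v}$.

Insert (4), (5) into (3):

(6)    $0 \le \rho\left(\hat\theta - \frac{\partial \hat e}{\partial s}\right) \frac{Ds}{Dt} + \left(\hat T - \frac{\partial \hat e}{\partial v} I\right) : D\mathbf{v} - \frac{1}{\hat\theta} \hat q \cdot D\theta$.

Here $v$ denotes the specific volume and $\mathbf{v}$ the velocity.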

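The main point is that (6) must hold for all admissible thermodynamic processes, and here
is how we can build them. First we select any deformation $\chi$ as in §A and any function $s$.
Then $\mathbf{v} = \frac{\partial \chi}{\partial t}$, $F = D_X \chi$, $\rho = (\det F)^{-1} \rho_0$. Thus we have $v = \rho^{-1}$ and so can then employ
(2) to compute $e$, $\theta$, $T$, $q$. Lastly the balance of momentum and energy equations ((2), (4)
from §B) allow us to define $b$, $r$.

We exploit this freedom in choosing $\chi$, $s$ as follows. First fix any time $t_0 > 0$. Then choose
$\chi$ as above so that $\rho(\cdot, t_0) = (\det F(\cdot, t_0))^{-1} \rho_0 \equiv \rho$ is constant, and $s$ so that $s(\cdot, t_0) \equiv s$
is constant. Then $v(\cdot, t_0)$ is constant, whence (2)(b) implies $D\theta(\cdot, t_0) = 0$. Fix any point
$x_0 \in U(t_0)$. We can choose $\chi$, $s$ as above, so that $D\mathbf{v} = L = \frac{\partial F}{\partial t} F^{-1}$ and $\frac{Ds}{Dt}$ at $(x_0, t_0)$ are
arbitrary. As (6) implies

        $0 \le \rho\left(\hat\theta - \frac{\partial \hat e}{\partial s}\right) \frac{Ds}{Dt} + \left(\hat T - \frac{\partial \hat e}{\partial v} I\right) : D\mathbf{v}$

at $(x_0, t_0)$ for all choices of $D\mathbf{v}$, $\frac{Ds}{Dt}$, we conclude $\hat\theta = \frac{\partial \hat e}{\partial s}$, $\hat T = \frac{\partial \hat e}{\partial v} I$. We rewrite these formulas,
dropping the circumflex:

(7)    $\frac{\partial e}{\partial s} = \theta$    temperature formula

and

(8)    $T = -pI$,

for

(9)    $\frac{\partial e}{\partial v} = -p$    pressure formula.

Equation (9) is the definition of the pressure $p = p(s, v)$. Thus in (7)-(9) we regard $e$, $T$, $p$
as functions of $(s, v)$.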

Remarks. (i) The balance of momentum equation (for $b \equiv 0$) becomes the compressible
Euler equations

(10)    $\rho \frac{D\mathbf{v}}{Dt} = -Dp$,

about which more later. If the flow is incompressible, then $\rho$ is constant (say $\rho \equiv 1$) and
thus (6) in §A implies $\mathrm{div}\, \mathbf{v} = 0$. We obtain then the incompressible Euler equations

(11)    $\frac{D\mathbf{v}}{Dt} = -Dp$,
        $\mathrm{div}\, \mathbf{v} = 0$.

(ii) Note the similarity of (7), (9) with the classical formulas

        $\frac{\partial E}{\partial S} = T$,  $\frac{\partial E}{\partial V} = -P$

for a simple fluid in Chapter I. □

Returning to (6), we conclude from (7)-(9) that

        $\hat q(s, v, D\theta) \cdot D\theta \le 0$.

As $s$, $v$, and $D\theta$ are arbitrary, provided $D\hat\theta(s,v) \ne 0$, we deduce:

(12)    $q(s, v, p) \cdot p \le 0$    heat conduction inequality

for all $s$, $v$ and all $p \in \mathbb{R}^n$. Since the mapping

        $p \mapsto q(s, v, p) \cdot p$

has a maximum at $p = 0$, we see as well that

        $q(s, v, 0) = 0$,

which means there is no heat flow without a temperature gradient.

b. We call our body a viscous fluid with heat conduction if there exist five functions
$\hat e$, $\hat\theta$, $\hat T$, $\hat l$, $\hat q$ such that

(13)    (a) $e = \hat e(s, v)$
        (b) $\theta = \hat\theta(s, v)$
        (c) $T = \hat T(s, v) + \hat l(s, v)[D\mathbf{v}]$
        (d) $q = \hat q(s, v, D\theta)$.

Here for each $(s, v)$, we assume $\hat l(s, v)[\cdot]$ is a linear mapping from $\mathbb{M}^{3 \times 3}$ into $\mathbb{S}^3$. This term
models the fluid viscosity.

As before, we conclude from the Clausius-Duhem inequality that

        $0 \le \rho\left(\hat\theta - \frac{\partial \hat e}{\partial s}\right) \frac{Ds}{Dt} + \left(\hat T - \frac{\partial \hat e}{\partial v} I\right) : D\mathbf{v} + \hat l[D\mathbf{v}] : D\mathbf{v} - \frac{1}{\hat\theta} \hat q \cdot D\theta$.

Since this estimate must be true for all $\chi$ and $s$, we deduce as before the temperature formula
(7).

We then pick $\chi$, $s$ appropriately to deduce:

        $0 \le \left(\hat T - \frac{\partial \hat e}{\partial v} I\right) : D\mathbf{v} + \hat l[D\mathbf{v}] : D\mathbf{v}$.

In this expression $D\mathbf{v}$ is arbitrary, and so in fact

        $0 \le \lambda \left(\hat T - \frac{\partial \hat e}{\partial v} I\right) : D\mathbf{v} + \lambda^2 \, \hat l[D\mathbf{v}] : D\mathbf{v}$

for each $\lambda \in \mathbb{R}$. If we define $p$ by (9), the identity

(14)    $T = -pI + l[D\mathbf{v}]$

follows, as does the inequality

(15)    $l(L) : L \ge 0$    dissipation inequality

for all $L \in \mathbb{M}^{3 \times 3}$. The heat conduction inequality (12) then results.

Remark. As explained in [C-N] constitutive relations must also satisfy the principle of
material objectivity, which means that an admissible process must remain so after a (possibly
time dependent) orthogonal change of frame. This principle turns out to imply that $l[\cdot]$ must
have the form

(16)    $l[D\mathbf{v}] = \mu(D\mathbf{v} + (D\mathbf{v})^T) + \lambda(\mathrm{div}\, \mathbf{v}) I$

where $\mu$, $\lambda$ are scalar functions of $(s, v)$. The dissipation inequality (15) turns out to force

        $\mu \ge 0$,  $\lambda + \frac{2}{3}\mu \ge 0$.

Taking $\mu$, $\lambda$ to be constant in the balance of momentum equations (for $b \equiv 0$) gives the
compressible Navier-Stokes equations

(17)    $\rho \frac{D\mathbf{v}}{Dt} = -Dp + \mu \Delta \mathbf{v} + (\lambda + \mu) D(\mathrm{div}\, \mathbf{v})$.

If the flow is incompressible, then $\rho$ is constant (say $\rho \equiv 1$) and so the conservation of
mass equation implies $\mathrm{div}\, \mathbf{v} = 0$. We thereby derive the incompressible Navier-Stokes
equations

(18)    $\frac{D\mathbf{v}}{Dt} = -Dp + \mu \Delta \mathbf{v}$,
        $\mathrm{div}\, \mathbf{v} = 0$.

□
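The claim that the dissipation inequality (15) holds for the form (16) when $\mu \ge 0$ and $\lambda + \frac{2}{3}\mu \ge 0$ can be probed numerically. The sketch below samples random $3 \times 3$ matrices $L$ and evaluates $l(L) : L$; the function names and the sampling scheme are illustrative assumptions, and random sampling is of course a spot check, not a proof.

```python
import random

def viscous_stress(L, mu, lam):
    # formula (16): l[L] = mu (L + L^T) + lam (tr L) I
    tr = L[0][0] + L[1][1] + L[2][2]
    return [[mu * (L[i][j] + L[j][i]) + (lam * tr if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def double_dot(A, B):
    # A : B = sum over i, j of a_ij * b_ij
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

def dissipation(L, mu, lam):
    # the quantity l(L) : L appearing in (15)
    return double_dot(viscous_stress(L, mu, lam), L)

def min_dissipation(mu, lam, trials=2000, seed=0):
    # smallest value of l(L) : L over randomly sampled matrices L
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        L = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
        worst = min(worst, dissipation(L, mu, lam))
    return worst
```

Taking $L = I$ shows why the second condition is needed: then $l(L) : L = 6\mu + 9\lambda$, which is negative as soon as $\lambda + \frac{2}{3}\mu < 0$.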

2. Elastic materials 

We can model elastic solids as well, for which it is appropriate to display certain
quantities in Lagrangian coordinates (i.e., in terms of $X$, not $x$).

a. We call our body a perfect elastic material with heat conduction provided there exist
four functions $\hat e$, $\hat\theta$, $\hat T$, $\hat q$ such that

(19)    (a) $e = \hat e(s, F)$
        (b) $\theta = \hat\theta(s, F)$
        (c) $T = \hat T(s, F)$
        (d) $q = \hat q(s, F, D\theta)$.

These are the constitutive relations.

Notation. Equation (a) means

        $e(x,t) = \hat e(s(x,t), F(X,t))$    $(x \in U(t), t \ge 0)$

where $\hat e : \mathbb{R} \times \mathbb{M}^{3 \times 3} \to \mathbb{R}$ and $X = \psi(x,t)$. Equations (b)-(d) have similar interpretations.
Recall from §A that $F = D_X \chi$ is the deformation gradient. □

According to (3):

(20)    $0 \le \rho\theta\gamma = \rho\left(\theta \frac{Ds}{Dt} - \frac{De}{Dt}\right) + T : D\mathbf{v} - \frac{1}{\theta} q \cdot D\theta$.

Owing to (19)(a) we have

        $\frac{De}{Dt} = \frac{\partial \hat e}{\partial s} \frac{Ds}{Dt} + \frac{\partial \hat e}{\partial F} : \left(\frac{\partial F}{\partial t} + D_X F \cdot \frac{D\psi}{Dt}\right)$.

Differentiate the identity $\psi(\chi(X,t),t) = X$ with respect to $t$, to deduce $\frac{D\psi}{Dt} = 0$. So

        $\frac{De}{Dt} = \frac{\partial \hat e}{\partial s} \frac{Ds}{Dt} + \frac{\partial \hat e}{\partial F} : \frac{\partial F}{\partial t}$.

Recalling from (9), (10) in §A that $D\mathbf{v} = L = \frac{\partial F}{\partial t} F^{-1}$, we substitute into (20), thereby
deducing

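(21)    $0 \le \rho\left(\hat\theta - \frac{\partial \hat e}{\partial s}\right) \frac{Ds}{Dt} + \left(\hat T - \rho \frac{\partial \hat e}{\partial F} F^T\right) : D\mathbf{v} - \frac{1}{\hat\theta} \hat q \cdot D\theta$.

Fix $t_0 > 0$. We take $\chi$ so that $D_X \chi(\cdot, t_0) = F(\cdot, t_0) = F$, an arbitrary matrix. Next we pick
$s$ so that $s(\cdot, t_0) \equiv s$ is constant. Thus (19)(b) forces $D\theta(\cdot, t_0) = 0$. As $D\mathbf{v} = L = \frac{\partial F}{\partial t} F^{-1}$ and
$\frac{Ds}{Dt}$ can take any values at $t = t_0$, we deduce from (21) the temperature formula (7), as well
as the identity

(22)    $T = \rho \frac{\partial e}{\partial F} F^T$    stress formula,

where, we recall from (11) in §A, $\rho = (\det F)^{-1} \rho_0$. Here we have again dropped the circumflex,
and so are regarding $T$, $e$ as functions of $(s, F)$. Next the heat conduction inequality (12)
follows as usual.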



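Footnote 2. In coordinates: $\left(\frac{\partial \hat e}{\partial F}\right)_{ij} = \frac{\partial \hat e}{\partial F_{ij}}$ $(1 \le i, j \le 3)$.

Note. Let us check that (22) reduces to (8), (9) in the case of a fluid, i.e. $\hat e = \hat e(s, v)$,
$v = \rho^{-1} = (\det F) \rho_0^{-1}$. Then the $(i,j)$-th entry of $\hat T$ is

        $\rho \sum_{k=1}^3 \frac{\partial \hat e}{\partial F_{ik}} F_{jk} = \frac{\rho}{\rho_0} \frac{\partial \hat e}{\partial v} \sum_{k=1}^3 \frac{\partial (\det F)}{\partial F_{ik}} F_{jk}$.

Now

        $\frac{\partial (\det F)}{\partial F_{ik}} = (\mathrm{cof}\, F)_{ik}$,

$\mathrm{cof}\, F$ denoting the cofactor matrix of $F$. Thus

        $\hat T = \rho \frac{\partial \hat e}{\partial F} F^T = \frac{\rho}{\rho_0} \frac{\partial \hat e}{\partial v} (\mathrm{cof}\, F) F^T = \frac{\rho}{\rho_0} \frac{\partial \hat e}{\partial v} (\det F) I$.

As $\rho = (\det F)^{-1} \rho_0$, we conclude $\hat T = \frac{\partial \hat e}{\partial v} I$, and this identity is equivalent to (8), (9). □

b. We have a viscous elastic material with heat conduction if these constitutive relations
hold:

(23)    (a) $e = \hat e(s, F)$
        (b) $\theta = \hat\theta(s, F)$
        (c) $T = \hat T(s, F) + \hat l(s, F)[D\mathbf{v}]$
        (d) $q = \hat q(s, F, D\theta)$.

We deduce as above the temperature formula (7), the heat conduction inequality (12), the
dissipation inequality (15), and the identity

        $T = \rho \frac{\partial e}{\partial F} F^T + l[D\mathbf{v}]$.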
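The reduction of the stress formula (22) to the fluid case in the Note above rests on two matrix facts: $\frac{\partial(\det F)}{\partial F_{ik}} = (\mathrm{cof}\, F)_{ik}$ and $(\mathrm{cof}\, F) F^T = (\det F) I$. A minimal sketch checking both for a given $3 \times 3$ matrix follows; the helper names are hypothetical and the derivative is approximated by central differences.

```python
def det3(F):
    # determinant of a 3x3 matrix by expansion along the first row
    return (F[0][0] * (F[1][1] * F[2][2] - F[1][2] * F[2][1])
            - F[0][1] * (F[1][0] * F[2][2] - F[1][2] * F[2][0])
            + F[0][2] * (F[1][0] * F[2][1] - F[1][1] * F[2][0]))

def cofactor(F):
    # (cof F)_{ik} = (-1)^{i+k} * (minor of F with row i and column k removed)
    C = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for k in range(3):
            rows = [r for r in range(3) if r != i]
            cols = [c for c in range(3) if c != k]
            minor = (F[rows[0]][cols[0]] * F[rows[1]][cols[1]]
                     - F[rows[0]][cols[1]] * F[rows[1]][cols[0]])
            C[i][k] = (-1) ** (i + k) * minor
    return C

def cof_times_transpose(F):
    # entry (i, j) of (cof F) F^T is sum_k (cof F)_{ik} F_{jk}
    C = cofactor(F)
    return [[sum(C[i][k] * F[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def ddet_dF(F, i, k, h=1e-6):
    # central-difference approximation of d(det F)/dF_{ik}
    Fp = [row[:] for row in F]
    Fm = [row[:] for row in F]
    Fp[i][k] += h
    Fm[i][k] -= h
    return (det3(Fp) - det3(Fm)) / (2 * h)
```

Since $\det F$ is linear in each single entry of $F$, the central difference is exact up to rounding, so the comparison with `cofactor` is a sharp test.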

Remark. Our constitutive rules (2), (13), (19), (23) actually violate the general principle of
equipresence, according to which a quantity present as an independent variable in one constitutive
equation should be present in all, unless its presence contradicts some rule of physics
or an invariance rule. See Truesdell-Noll [T-N, p. 359-363] for an analysis incorporating this
principle.

D. Workless dissipation 

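Our model from §II.A of a homogeneous fluid body without dissipation illustrated an
idealized physical situation in which work, but not dissipation, can occur. This is dissipationless
work. We now provide a mathematical framework for workless dissipation (from
Gurtin [GU, Chapter 14]).

We adopt the terminology from §A-C above, with the additional proviso that our material
body is now assumed to be rigid. So $U(t) = U$ (for all $t \ge 0$) and

(1)    $\mathbf{v} \equiv 0$, $b \equiv 0$.

We simplify further by supposing the mass density is constant:

(2)    $\rho \equiv 1$.

The remaining relevant physical quantities are thus $e$, $q$, $r$, $s$ and $\theta$.

Under assumption (1) the momentum balance equation is trivial. The energy balance
equation ((4) in §B) now reads:

(3)    $\frac{\partial e}{\partial t} = r - \mathrm{div}\, q$

and the entropy flux inequality ((6) in §B) becomes:

(4)    $\frac{\partial s}{\partial t} \ge \frac{r}{\theta} - \mathrm{div}\left(\frac{q}{\theta}\right)$.

The local production of entropy is

(5)    $\gamma = \frac{\partial s}{\partial t} - \frac{r}{\theta} + \frac{\mathrm{div}(q)}{\theta} - \frac{q \cdot D\theta}{\theta^2} \ge 0$.

Combining (3)-(5) as before, we deduce:

(6)    $0 \le \gamma\theta = \left(\theta \frac{\partial s}{\partial t} - \frac{\partial e}{\partial t}\right) - \frac{q \cdot D\theta}{\theta}$.

It is convenient to introduce the free energy/unit mass:

(7)    $f = e - \theta s$,

a relation reminiscent of the formula $F = E - TS$ from Chapter I. In terms of $f$, (6)
becomes:

(8)    $\frac{\partial f}{\partial t} + s \frac{\partial \theta}{\partial t} + \frac{q \cdot D\theta}{\theta} \le 0$.

For our model of heat conduction in a rigid body we introduce next the constitutive relations

(9)    (a) $e = \hat e(\theta, D\theta)$
       (b) $s = \hat s(\theta, D\theta)$
       (c) $q = \hat q(\theta, D\theta)$

where $\hat e$, $\hat s$, $\hat q$ are given functions. We seek to determine what (8) implies about these
structural functions.

First, define

(10)    $\hat f(\theta, p) := \hat e(\theta, p) - \theta \hat s(\theta, p)$;

so that (7) says

        $f = \hat f(\theta, D\theta)$.

Therefore

        $\frac{\partial f}{\partial t} = \frac{\partial \hat f}{\partial \theta} \frac{\partial \theta}{\partial t} + D_p \hat f \cdot D\left(\frac{\partial \theta}{\partial t}\right)$.

Plug into (8):

(11)    $\left(\frac{\partial \hat f}{\partial \theta} + \hat s\right) \frac{\partial \theta}{\partial t} + D_p \hat f \cdot D\left(\frac{\partial \theta}{\partial t}\right) + \frac{\hat q \cdot D\theta}{\theta} \le 0$.

As before, we can select $\theta$ so that $\frac{\partial \theta}{\partial t}$, $D\left(\frac{\partial \theta}{\partial t}\right)$ and $D\theta$ are arbitrary at any given point $(x, t)$.
Consequently we deduce

(12)    $\frac{\partial \hat f}{\partial \theta} = -\hat s$    free energy formula,

an analogue of the classical relation

        $\frac{\partial F}{\partial T} = -S$

discussed in Chapter I. Also we conclude

(13)    $D_p \hat f = 0$,

and so (10) implies

        $D_p \hat e(\theta, p) = \theta D_p \hat s(\theta, p)$

for all $\theta$, $p$. But (12), (13) allow us to deduce

        $0 = \frac{\partial}{\partial \theta}(D_p \hat f) = -D_p \hat s$.

Hence $D_p \hat e = D_p \hat s \equiv 0$, and so $\hat e$, $\hat s$ do not depend on $p$. Thus (9)(a), (b) become

(14)    $e = \hat e(\theta)$,
        $s = \hat s(\theta)$.

The energy and entropy thus depend only on $\theta$ and not $D\theta$. Finally we conclude from (11)
that

(15)    $\hat q(\theta, p) \cdot p \le 0$    heat conduction inequality

for all $\theta$, $p$. The free energy is

        $f = \hat f(\theta) = \hat e(\theta) - \theta \hat s(\theta)$.

Finally we define the heat capacity/unit mass:

(16)    $c_v(\theta) = \hat e'(\theta)$    $\left(' = \frac{d}{d\theta}\right)$,

in analogy with the classical formula

        $C_V = \left(\frac{\partial E}{\partial T}\right)_V$.

Let us compute $\hat f' = \hat e' - \hat s - \theta \hat s'$; whence (12), (16) imply

        $\hat f'' = -\hat s' = -\frac{c_v}{\theta}$.

In summary

(17)    $c_v(\theta) = \theta \hat s'(\theta) = -\theta \hat f''(\theta)$.

In particular $\hat f$ is a strictly concave function of $\theta$ if and only if $c_v(\cdot) > 0$.

Finally we derive from (3), (16) the general heat conduction equation

(18)    $c_v(\theta) \frac{\partial \theta}{\partial t} + \mathrm{div}(\hat q(\theta, D\theta)) = r$.

This is a PDE for the temperature $\theta = \theta(x,t)$ $(x \in U, t \ge 0)$. The local production of
entropy is

(19)    $\gamma = -\frac{\hat q(\theta, D\theta) \cdot D\theta}{\theta^2}$,

as follows from (6), since (16), (17) give $\frac{\partial e}{\partial t} = \theta \frac{\partial s}{\partial t}$.

Remark. The special case that

(20)    $\hat q(\theta, p) = -Ap$

is called Fourier's Law, where $A \in \mathbb{M}^{3 \times 3}$, $A = ((a_{ij}))$. Owing to the heat conduction
inequality, we must then have

        $p \cdot (Ap) \ge 0$ for all $p$,

and so

(21)    $\sum_{i,j=1}^3 a_{ij} \xi_i \xi_j \ge 0$    $(\xi \in \mathbb{R}^3)$.

□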
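The relations (12) and (17) can be checked on a concrete example. The sketch below assumes the sample structural functions $\hat e(\theta) = \theta$ and $\hat s(\theta) = \log\theta$ (constant heat capacity $c_v \equiv 1$); this choice is an illustration of ours rather than part of the derivation above, and derivatives are approximated by finite differences.

```python
import math

def energy(theta):
    # e-hat(theta): assumed sample choice with constant heat capacity c_v = 1
    return theta

def entropy(theta):
    # s-hat(theta) = log(theta), consistent with c_v = theta * s'
    return math.log(theta)

def free_energy(theta):
    # formula (7): f = e - theta * s
    return energy(theta) - theta * entropy(theta)

def d1(g, x, h=1e-4):
    # central-difference first derivative
    return (g(x + h) - g(x - h)) / (2 * h)

def d2(g, x, h=1e-4):
    # central-difference second derivative
    return (g(x + h) - 2 * g(x) + g(x - h)) / (h * h)

def heat_capacity_three_ways(theta):
    # the three expressions in (16), (17): e', theta * s', -theta * f''
    return (d1(energy, theta),
            theta * d1(entropy, theta),
            -theta * d2(free_energy, theta))
```

For this choice $\hat f(\theta) = \theta - \theta\log\theta$, so $\hat f' = -\log\theta = -\hat s$ and $\hat f'' = -1/\theta < 0$, in agreement with the concavity statement after (17).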
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CHAPTER 4: Elliptic and parabolic equations 



In this chapter we will analyze linear PDE of elliptic and parabolic type in light of the
physical theory set forth in Chapters I-III above. We will mostly follow Day [D].

Generic PDE notation.

        $U, V, W$ = open subsets, usually of $\mathbb{R}^n$
        $u, v, w$ = typical functions
        $u_t = \frac{\partial u}{\partial t}$, $u_{x_i} = \frac{\partial u}{\partial x_i}$, $u_{x_i x_j} = \frac{\partial^2 u}{\partial x_i \partial x_j}$, etc.

□

A. Entropy and elliptic equations 
1. Definitions 

We will first study the linear PDE

(1)    $-\sum_{i,j=1}^n (a^{ij}(x) u_{x_i})_{x_j} = f$ in $U$,

where $U \subset \mathbb{R}^n$ is a bounded, connected open set with smooth boundary $\partial U$,

        $f : \bar U \to \mathbb{R}$,

and

        $A : \bar U \to \mathbb{S}^n$, $A = ((a^{ij}))$.

The unknown is

        $u = u(x)$ $(x \in \bar U)$.

We assume $u > 0$ in $\bar U$ and $u$ is smooth.

Physical interpretation. We henceforth regard (1) as a time-independent heat conduction
PDE, having the form of (18) from §III.D, with

(2)    $u$ = temperature
       $q = -ADu$ = heat flux
       $f$ = heat supply/unit mass.

We will additionally assume in the heat conduction PDE that the heat capacity $c_v$ is a constant,
say $c_v \equiv 1$. Then, up to additive constants, we see from formula (17) in §III.D that

(3)    $u$ = internal energy/unit mass
       $\log u$ = entropy/unit mass.

The local production of entropy is

(4)    $\gamma = \sum_{i,j=1}^n \frac{a^{ij} u_{x_i} u_{x_j}}{u^2}$.

We will henceforth assume the PDE (1) is uniformly elliptic, which means that there
exists a positive constant $\theta$ such that

(5)    $\sum_{i,j=1}^n a^{ij}(x) \xi_i \xi_j \ge \theta |\xi|^2$

for all $x \in \bar U$, $\xi \in \mathbb{R}^n$. We assume as well that the coefficient functions $\{a^{ij}\}_{i,j=1}^n$ are smooth.
Note that (5) implies $\gamma \ge 0$.

Notation. (i) If $V$ is any smooth region, the outer unit normal vector field is denoted
$\nu = (\nu_1, \dots, \nu_n)$. The outer $A$-normal field is

(6)    $\nu_A = A\nu$.

(ii) If $u : \bar V \to \mathbb{R}$ is smooth, the $A$-normal derivative of $u$ on $\partial V$ is

(7)    $\frac{\partial u}{\partial \nu_A} = Du \cdot (A\nu) = \sum_{i,j=1}^n a^{ij} \nu_i u_{x_j}$.

□

Note. According to (5) $\nu_A \cdot \nu > 0$ on $\partial V$ and so $\nu_A$ is an outward pointing vector field.

□

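Definitions. Let $V \subset U$ be any smooth region. We define

(8)    $F(V) = \int_V \frac{f}{u} \, dx$ = entropy supply to $V$

(9)    $G(V) = \int_V \gamma \, dx = \int_V \sum_{i,j=1}^n \frac{a^{ij} u_{x_i} u_{x_j}}{u^2} \, dx$ = internal entropy production

(10)    $R(V) = -\int_{\partial V} \frac{1}{u} \frac{\partial u}{\partial \nu_A} \, dS$ = entropy flux outward across $\partial V$.

Lemma For each subregion $V \subset U$, we have:

(11)    $F(V) + G(V) = R(V)$.

This is the entropy balance equation.

Proof. Divide the PDE (1) by $u$ and rewrite:

        $-\sum_{i,j=1}^n \left(\frac{a^{ij} u_{x_i}}{u}\right)_{x_j} = \sum_{i,j=1}^n \frac{a^{ij} u_{x_i} u_{x_j}}{u^2} + \frac{f}{u}$.

Integrate over $V$ and employ the Gauss-Green Theorem:

        $-\int_{\partial V} \frac{1}{u} \frac{\partial u}{\partial \nu_A} \, dS = \int_V \sum_{i,j=1}^n \frac{a^{ij} u_{x_i} u_{x_j}}{u^2} \, dx + \int_V \frac{f}{u} \, dx$.

The left side is $R(V)$, and the two terms on the right are $G(V)$ and $F(V)$ respectively.

□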
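The entropy balance (11) is easy to test in one space dimension. The sketch below assumes $n = 1$, $A = 1$, $U = (0,1)$ and the explicit temperature $u(x) = 1 + \frac{c}{2}x(1-x)$, which solves $-u'' = c$; these choices and the Simpson quadrature are illustrative assumptions made for the check.

```python
def simpson(g, a, b, n=2000):
    # composite Simpson rule with an even number n of subintervals
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def make_temperature(c):
    # u solves -u'' = c on (0, 1) with u = 1 at both endpoints (assumed example)
    u = lambda x: 1.0 + 0.5 * c * x * (1.0 - x)
    du = lambda x: 0.5 * c * (1.0 - 2.0 * x)
    return u, du

def entropy_terms(c, alpha, beta):
    # V = (alpha, beta) inside U = (0, 1), heat supply f = c
    u, du = make_temperature(c)
    F = simpson(lambda x: c / u(x), alpha, beta)              # (8) entropy supply
    G = simpson(lambda x: (du(x) / u(x)) ** 2, alpha, beta)   # (9) internal production
    # (10): the outer normal is +1 at beta and -1 at alpha
    R = -(du(beta) / u(beta) - du(alpha) / u(alpha))
    return F, G, R
```

For instance `entropy_terms(5.0, 0.2, 0.7)` returns three numbers with $F + G$ agreeing with $R$ up to quadrature error.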



We henceforth assume

(12)    $f \ge 0$ in $U$,

meaning physically that only heating is occurring.

2. Estimates of equilibrium entropy production 

The PDE (1) implies for each region $V \subset U$ that

(13)    $\int_{\partial V} q \cdot \nu \, dS = \int_V f \, dx$

for $q = -ADu$. This equality says that the heat flux outward through $\partial V$ balances the heat
production within $V$. Likewise the identity (11) says

(14)    $\int_{\partial V} \frac{q}{u} \cdot \nu \, dS = \int_V \gamma + \frac{f}{u} \, dx$,

which means the entropy flux outward through $\partial V$ balances the entropy production within
$V$, the latter consisting of

        $\int_V \gamma \, dx$, the rate of internal entropy generation, and
        $\int_V \frac{f}{u} \, dx$, the entropy supply.
a. A capacity estimate 

Now (13) shows clearly that the heat flux outward through $\partial U$ can be made arbitrarily
large if we take $f$ large enough. In contrast if $V \subset\subset U$, there is a bound on the size of the
entropy flux and the rate of internal entropy production, valid no matter how large $f$ is.

Assume that $V \subset\subset U$ and let $w$ solve the boundary value problem

(15)    $-\sum_{i,j=1}^n (a^{ij} w_{x_i})_{x_j} = 0$ in $U - \bar V$
        $w = 1$ on $\partial V$
        $w = 0$ on $\partial U$.

Definition. We define the capacity of $V$ (with respect to $\partial U$ and the matrix $A$) to be

(16)    $\mathrm{Cap}_A(V, \partial U) = \int_{U - V} \sum_{i,j=1}^n a^{ij} w_{x_i} w_{x_j} \, dx$,

$w$ solving (15).

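Integrating by parts shows:

(17)    $\mathrm{Cap}_A(V, \partial U) = -\int_{\partial V} \frac{\partial w}{\partial \nu_A} \, dS$,

$\nu_A$ denoting the outer $A$-normal vector field to $V$.

Theorem 1 We have

(18)    $R(V) \le \mathrm{Cap}_A(V, \partial U)$

for all choices of $f \ge 0$.

Proof. Take $w$ as in (15) and compute in $U - \bar V$:

        $\sum_{i,j=1}^n \left(\frac{w^2}{u} a^{ij} u_{x_i}\right)_{x_j} = \frac{w^2}{u} \sum_{i,j=1}^n (a^{ij} u_{x_i})_{x_j} - \frac{w^2}{u^2} \sum_{i,j=1}^n a^{ij} u_{x_i} u_{x_j} + 2\frac{w}{u} \sum_{i,j=1}^n a^{ij} u_{x_i} w_{x_j}$

        $= -\frac{w^2 f}{u} - \sum_{i,j=1}^n a^{ij} \left(\frac{w}{u} u_{x_i} - w_{x_i}\right)\left(\frac{w}{u} u_{x_j} - w_{x_j}\right) + \sum_{i,j=1}^n a^{ij} w_{x_i} w_{x_j}$,

where we employed the PDE (1). Since $f \ge 0$ and the ellipticity condition (5) holds, we
deduce

        $\sum_{i,j=1}^n \left(\frac{w^2}{u} a^{ij} u_{x_i}\right)_{x_j} \le \sum_{i,j=1}^n a^{ij} w_{x_i} w_{x_j}$ in $U - \bar V$.

Integrate over $U - \bar V$. Since $w = 0$ on $\partial U$ and the outer normal to $U - \bar V$ along $\partial V$ is $-\nu$, we find

        $-\int_{\partial V} \frac{w^2}{u} \frac{\partial u}{\partial \nu_A} \, dS \le \int_{U - V} \sum_{i,j=1}^n a^{ij} w_{x_i} w_{x_j} \, dx$.

Since $w = 1$ on $\partial V$, the term on the left is $R(V)$, whereas by the definition (16) the term on
the right is $\mathrm{Cap}_A(V, \partial U)$. □

Note that the entropy balance equation and (18) imply:

(19)    $G(V) \le \mathrm{Cap}_A(V, \partial U)$

for all $f \ge 0$, the term on the right depending solely on $A$ and the geometry of $V$, $U$. The above
calculation is from Day [D].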
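A one-dimensional example makes the contrast between heat flux and entropy flux concrete. The sketch below assumes $A = 1$, $U = (-1,1)$, $V = (-a,a)$ and the heat supply $f \equiv c$, for which both the capacity and the entropy flux have closed forms; these choices are illustrative assumptions and the function names are hypothetical.

```python
def capacity_1d(a):
    # w = (1 - |x|) / (1 - a) on a < |x| < 1 solves (15) with A = 1;
    # (16): Cap = integral of (w')^2 over two intervals of length 1 - a
    return 2.0 / (1.0 - a)

def entropy_flux_1d(c, a):
    # u = 1 + c (1 - x^2) / 2 solves -u'' = c on (-1, 1) with u = 1 on the boundary
    u_a = 1.0 + 0.5 * c * (1.0 - a * a)
    du_a = -c * a
    # (10): R(V) = -(u'(a)/u(a) - u'(-a)/u(-a)) = -2 u'(a)/u(a) by symmetry
    return -2.0 * du_a / u_a

def heat_flux_1d(c):
    # (13) with V = U: the total heat flux through the boundary of U equals 2c
    return 2.0 * c
```

As $c \to \infty$ the heat flux $2c$ is unbounded, while the entropy flux increases to $4a/(1-a^2)$, which stays below the capacity $2/(1-a)$ because $a < 1$.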

b. A pointwise bound 

Next we demonstrate that in fact we have an internal pointwise bound on $\gamma$ within any
region $V \subset\subset U$, provided

(20)    $f \equiv 0$ in $U$.

Theorem 2 Assume $f \equiv 0$ in $U$ and $V \subset\subset U$. Then there exists a constant $C$, depending
only on $\mathrm{dist}(V, \partial U)$ and the coefficients, such that

(21)    $\sup_V \gamma \le C$

for all positive solutions of (1).

In physical terms, if the heat supply is zero, we can estimate the local production of
entropy pointwise within $V$, completely irrespective of the boundary conditions for the
temperature on $\partial U$.

The following calculation is technically difficult, but, as we will see later in §VIII.D, is
important in other contexts as well.

Proof. 1. Let

(22)    $v = \log u$

denote the entropy. Then the PDE

        $-\sum_{i,j=1}^n (a^{ij} u_{x_i})_{x_j} = 0$ in $U$

becomes

(23)    $-\sum_{i,j=1}^n (a^{ij} v_{x_i})_{x_j} = \sum_{i,j=1}^n a^{ij} v_{x_i} v_{x_j} = \gamma$.

So

(24)    $-\sum_{i,j=1}^n a^{ij} v_{x_i x_j} + \sum_{i=1}^n b^i v_{x_i} = \gamma$ in $U$

for

        $b^i := -\sum_{j=1}^n a^{ij}_{x_j}$.

Differentiate (24) with respect to $x_k$:

(25)    $-\sum_{i,j=1}^n a^{ij} v_{x_k x_i x_j} + \sum_{i=1}^n b^i v_{x_k x_i} = \gamma_{x_k} + R_1$,

where $R_1$ denotes an expression satisfying the estimate

(26)    $|R_1| \le C(|D^2 v| + |Dv|)$.

2. Now



        $\gamma = \sum_{k,l=1}^n a^{kl} v_{x_k} v_{x_l}$.

Thus

(27)    $\gamma_{x_i} = \sum_{k,l=1}^n 2a^{kl} v_{x_k x_i} v_{x_l} + a^{kl}_{x_i} v_{x_k} v_{x_l}$,
        $\gamma_{x_i x_j} = \sum_{k,l=1}^n 2a^{kl} v_{x_k x_i x_j} v_{x_l} + 2a^{kl} v_{x_k x_i} v_{x_l x_j} + R_2$,

where

(28)    $|R_2| \le C(|D^2 v| \, |Dv| + |Dv|^2)$.

Hence

(29)    $-\sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} + \sum_{i=1}^n b^i \gamma_{x_i} = 2\sum_{k,l=1}^n a^{kl} v_{x_l} \left(-\sum_{i,j=1}^n a^{ij} v_{x_k x_i x_j} + \sum_{i=1}^n b^i v_{x_k x_i}\right) - 2\sum_{i,j,k,l=1}^n a^{ij} a^{kl} v_{x_k x_i} v_{x_l x_j} + R_3$,

$R_3$ another remainder term satisfying an estimate like (28). In view of the uniform ellipticity
condition (5), we have

        $\sum_{i,j,k,l=1}^n a^{ij} a^{kl} v_{x_k x_i} v_{x_l x_j} \ge \theta^2 |D^2 v|^2$.

This estimate and the equality (25) allow us to deduce from (29) that

(30)    $-\sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} + \sum_{i=1}^n b^i \gamma_{x_i} \le 2\sum_{k,l=1}^n a^{kl} v_{x_l} \gamma_{x_k} - 2\theta^2 |D^2 v|^2 + R_4$,

$R_4$ satisfying an estimate like (28):

        $|R_4| \le C(|D^2 v| \, |Dv| + |Dv|^2)$.

Recall now Cauchy's inequality with $\varepsilon$

        $ab \le \varepsilon a^2 + \frac{1}{4\varepsilon} b^2$    $(a, b, \varepsilon > 0)$,

and further note

        $\gamma \ge \theta |Dv|^2$.

Thus

        $|R_4| \le \theta^2 |D^2 v|^2 + C\gamma$.

Consequently (30) implies

(31)    $\theta^2 |D^2 v|^2 - \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} \le C(1 + \gamma^{1/2}) |D\gamma| + C\gamma$.

Next observe that the PDE (24) implies

        $\gamma \le C(|D^2 v| + |Dv|)$
        $\le C(|D^2 v| + \gamma^{1/2})$
        $\le C|D^2 v| + C + \frac{\gamma}{2}$,

where we again utilized Cauchy's inequality. Thus

(32)    $\gamma \le C(|D^2 v| + 1)$.

This estimate incorporated into (31) yields:

(33)    $\sigma \gamma^2 - \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} \le C(1 + \gamma^{1/2}) |D\gamma| + C$

for some $\sigma > 0$.

3. We have managed to show that $\gamma$ satisfies within $U$ the differential inequality (33),
where the positive constants $C$, $\sigma$ depend only on the coefficients. Now we demonstrate that
this estimate, owing to the quadratic term on the left, implies a bound from above for $\gamma$ on
any subregion $V \subset\subset U$.

So take $V \subset\subset U$ and select a cutoff function $\zeta : U \to \mathbb{R}$ satisfying

(34)    $0 \le \zeta \le 1$, $\zeta \equiv 1$ on $V$,
        $\zeta \equiv 0$ near $\partial U$.

Write

(35)    $\eta := \zeta^4 \gamma$

and compute

(36)    $\eta_{x_i} = \zeta^4 \gamma_{x_i} + 4\zeta^3 \zeta_{x_i} \gamma$.

Select a point $x_0 \in \bar U$, where $\eta$ attains its maximum. If $\zeta(x_0) = 0$, then $\eta \equiv 0$. Otherwise
$\zeta(x_0) > 0$, $x_0 \in U$, and so

        $D\eta(x_0) = 0$, $D^2 \eta(x_0) \le 0$.

Consequently

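(37)    $\zeta D\gamma = -4\gamma D\zeta$ at $x_0$.

So at the point $x_0$:

        $0 \le -\sum_{i,j=1}^n a^{ij} \eta_{x_i x_j} = -\zeta^4 \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} + R_5$,

where

(38)    $|R_5| \le C(\zeta^3 |D\gamma| + \zeta^2 \gamma)$.

Invoking (33) we compute

(39)    $\sigma \zeta^4 \gamma^2 \le C \zeta^4 \gamma^{1/2} |D\gamma| + C + R_6$,

$R_6$ being estimated as in (38). Now (37) implies

        $\zeta^4 \gamma^{1/2} |D\gamma| \le C \zeta^3 \gamma^{3/2} \le \frac{\sigma}{4} \zeta^4 \gamma^2 + C$,

where we employed Young's inequality with $\varepsilon$

        $ab \le \varepsilon a^p + C(\varepsilon) b^q$    $\left(\frac{1}{p} + \frac{1}{q} = 1, \ a, b, \varepsilon > 0\right)$

for $p = \frac{4}{3}$, $q = 4$. Also

        $|R_6| \le C(\zeta^3 |D\gamma| + \zeta^2 \gamma)$
        $\le C \zeta^2 \gamma$
        $\le \frac{\sigma}{4} \zeta^4 \gamma^2 + C$.

These estimates and (39) imply

(40)    $\zeta^4 \gamma^2 \le C$ at $x_0$,

the constants $C$, $\sigma$ depending only on $\zeta$ and the coefficients of the PDE. As $\eta = \zeta^4 \gamma$ attains
its maximum over $\bar U$ at $x_0$, (40) provides a bound on $\eta$.

In particular then, since $\zeta \equiv 1$ on $V$, estimate (21) follows. □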

3. Harnack inequality 

As earlier noted, the pointwise estimate (21) is quite significant from several viewpoints.
As a first illustration we note that (21) implies

Theorem 3 For each connected region $V \subset\subset U$ there exists a constant $C$, depending only
on $U$, $V$, and the coefficients, such that

(41)    $\sup_V u \le C \inf_V u$

for each nonnegative solution $u$ of

(42)    $-\sum_{i,j=1}^n (a^{ij} u_{x_i})_{x_j} = 0$ in $U$.

Remark. Estimate (41) is Harnack's inequality and is important since it is completely
independent of the boundary values of $u$ on $\partial U$. □

Proof. Take $V \subset\subset W \subset\subset U$ and $r > 0$ so small that $B(x, r) \subset W$ for each $x \in V$. Let
$\varepsilon > 0$. Since $\tilde u = u + \varepsilon > 0$ solves (42), Theorem 2 implies

(43)    $\sup_W \frac{|Du|}{u + \varepsilon} \le C$

for some constant $C$ depending only on $W$, $U$, etc. Take any points $y, z \in B(x, r) \subset W$.
Then

        $|\log(u(y) + \varepsilon) - \log(u(z) + \varepsilon)| \le \sup_{B(x,r)} |D \log(u + \varepsilon)| \, |y - z| \le 2Cr =: C_1$

owing to (43). So

        $\log(u(y) + \varepsilon) \le C_1 + \log(u(z) + \varepsilon)$

and thus

        $u(y) + \varepsilon \le C_2 (u(z) + \varepsilon)$

for $C_2 := e^{C_1}$. Let $\varepsilon \to 0$ to deduce:

(44)    $\max_{B(x,r)} u \le C_2 \min_{B(x,r)} u$.

As $V$ is connected, we can cover $V$ by finitely many overlapping balls $\{B(x_i, r)\}_{i=1}^N$. We
repeatedly apply (44), to deduce (41), with $C := C_2^N$. □
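In one dimension with $A = 1$ both the pointwise bound (21) and Harnack's inequality (41) can be seen explicitly, since the positive solutions of $-u'' = 0$ on $U = (0,1)$ are the linear functions $u(x) = p(1-x) + qx$ with $p, q > 0$. The sketch below assumes $V = (1/4, 3/4)$, for which the constants $C = 16$ in (21) and $C = 3$ in (41) can be verified by hand; the setup and names are illustrative assumptions.

```python
import random

def harnack_ratio_and_gamma(p, q, lo=0.25, hi=0.75, samples=101):
    # u(x) = p (1 - x) + q x solves -u'' = 0 on (0, 1) with u(0) = p, u(1) = q
    xs = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    us = [p * (1.0 - x) + q * x for x in xs]
    # gamma = (u')^2 / u^2 is the local production of entropy (4) with A = 1
    gamma_max = max(((q - p) / u) ** 2 for u in us)
    return max(us) / min(us), gamma_max

def worst_case(trials=1000, seed=1):
    rng = random.Random(seed)
    worst_ratio, worst_gamma = 0.0, 0.0
    for _ in range(trials):
        # boundary temperatures spread over many orders of magnitude
        p, q = 10 ** rng.uniform(-6, 6), 10 ** rng.uniform(-6, 6)
        ratio, gamma = harnack_ratio_and_gamma(p, q)
        worst_ratio = max(worst_ratio, ratio)
        worst_gamma = max(worst_gamma, gamma)
    return worst_ratio, worst_gamma
```

However disparate the boundary temperatures $p$ and $q$ are, the ratio $\sup_V u / \inf_V u$ stays below 3 and $\gamma$ stays below $16 = \mathrm{dist}(V, \partial U)^{-2}$ on $V$.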

Corollary (Strong Maximum Principle) Assume $u \in C^2(\bar U)$ solves the PDE (42), and $U$ is
bounded, connected. Then either

(45)    $\min_{\partial U} u < u(x) < \max_{\partial U} u$    $(x \in U)$

or else

(46)    $u$ is constant on $U$.

Proof. 1. Take $M := \max_{\partial U} u$, $\tilde u := M - u$. Then

(47)    $-\sum_{i,j=1}^n (a^{ij} \tilde u_{x_i})_{x_j} = 0$ in $U$
        $\tilde u \ge 0$ on $\partial U$.

Multiply the PDE by $\tilde u^-$ and integrate by parts:

(48)    $\theta \int_{U \cap \{\tilde u < 0\}} |D\tilde u|^2 \, dx \le -\sum_{i,j=1}^n \int_U a^{ij} \tilde u_{x_i} \tilde u^-_{x_j} \, dx = 0$.

Here we used the fact that

        $D\tilde u^- = 0$ a.e. on $\{\tilde u \ge 0\}$,
        $D\tilde u^- = -D\tilde u$ a.e. on $\{\tilde u < 0\}$.

Then (48) implies $D\tilde u^- = 0$ a.e. in $U$. As $\tilde u^- = 0$ on $\partial U$, we deduce

        $\tilde u^- \equiv 0$ in $U$

and so

(49)    $\tilde u \ge 0$ in $U$.

This is a form of the weak maximum principle.

2. Next take any connected $V \subset\subset U$. Harnack's inequality implies

        $\sup_V \tilde u \le C \inf_V \tilde u$.

Thus either $\tilde u > 0$ everywhere on $V$ or else $\tilde u \equiv 0$ on $V$. This conclusion is true for each $V$
as above: the dichotomy (45), (46) follows. □

Remark. We have deduced Harnack's inequality and thus the strong maximum principle
from our interior pointwise bound (21) on the local production of entropy. An interesting
philosophical question arises: Are the foregoing PDE computations really "entropy calculations"
from physics? Purely mathematically the point is that the change of variables $v = \log u$
converts the linear PDE

$-\sum_{i,j=1}^n (a^{ij}u_{x_i})_{x_j} = 0$ in $U$

into the nonlinear PDE

$-\sum_{i,j=1}^n (a^{ij}v_{x_i})_{x_j} = \sum_{i,j=1}^n a^{ij}v_{x_i}v_{x_j}$ in $U$,

which owing to the estimate

$\sum_{i,j=1}^n a^{ij}v_{x_i}v_{x_j} \ge \theta |Dv|^2$

admits "better than linear" interior estimates. Is all this just a mathematical accident (since
log is an important elementary function) or is it an instance of basic physical principles
(since entropy is a fundamental thermodynamic concept)? We urgently need a lavishly
funded federal research initiative to answer this question. □

B. Entropy and parabolic equations 

1. Definitions 

We turn our attention now to the time-dependent PDE

(1)    $u_t - \sum_{i,j=1}^n (a^{ij}u_{x_i})_{x_j} = f$ in $U_T$

where $U$ is as before and

$U_T = U \times (0,T]$

for some $\infty \ge T > 0$. We are given

$f : U_T \to \mathbb{R}$

and

$A : U \to \mathbb{S}^n$,  $A = ((a^{ij}))$;

and the unknown is

$u = u(x,t)$    $(x \in U,\ 0 \le t \le T)$.

We always suppose $u > 0$.

Physical interpretation. We henceforth think of (1) as a heat conduction PDE, having
the form of (18) from §III.D with

(2)    $u$ = temperature,  $q = -ADu$ = heat flux,  $f$ = heat supply/unit mass

and the heat capacity is taken to be

$c_v \equiv 1$.

Also, up to additive constants, we have

(3)    $u$ = internal energy/unit mass,  $\log u$ = entropy/unit mass.

The local production of entropy is

$\gamma = \sum_{i,j=1}^n \frac{a^{ij}u_{x_i}u_{x_j}}{u^2}$.

In the special case that $A \equiv I$, $f \equiv 0$, our PDE (1) reads

(4)    $u_t - \Delta u = 0$ in $U_T$.

This is the heat equation. □
Definitions. Let $t \ge 0$ and $V \subset U$ be any smooth subregion. We define

(5)    $S(t,V) = \int_V \log u(\cdot,t)\,dx$ = entropy within $V$ at time $t$

(6)    $F(t,V) = \int_V \frac{f(\cdot,t)}{u(\cdot,t)}\,dx$ = entropy supply to $V$ at time $t$

(7)    $G(t,V) = \int_V \gamma(\cdot,t)\,dx = \int_V \sum_{i,j=1}^n \frac{a^{ij}u_{x_i}u_{x_j}}{u^2}\,dx$ = rate of internal entropy generation in $V$ at time $t$

(8)    $R(t,V) = -\int_{\partial V} \frac{1}{u(\cdot,t)}\frac{\partial u(\cdot,t)}{\partial \nu_A}\,dS$ = entropy flux outward across $\partial V$ at time $t$.

Lemma For each $t \ge 0$ and each subregion $V \subset U$ we have

(9)    $\frac{d}{dt}S(t,V) = F(t,V) + G(t,V) - R(t,V)$.

This is the entropy production equation.
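Proof. Divide the PDE (1) by $u$ and rewrite:

$\frac{u_t}{u} - \sum_{i,j=1}^n \Big(\frac{a^{ij}u_{x_i}}{u}\Big)_{x_j} = \sum_{i,j=1}^n \frac{a^{ij}u_{x_i}u_{x_j}}{u^2} + \frac{f}{u}$.

Integrate over $V$:

$\int_V \frac{u_t}{u}\,dx + \int_{\partial V} -\frac{1}{u}\frac{\partial u}{\partial\nu_A}\,dS = \int_V \gamma\,dx + \int_V \frac{f}{u}\,dx$.

The four terms are, in order, $\frac{d}{dt}S(t,V)$, $R(t,V)$, $G(t,V)$ and $F(t,V)$.

□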
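The entropy production equation (9) can be checked numerically on a simple example. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes one space dimension, $A \equiv I$, $f \equiv 0$, the explicit solution $u(x,t) = 1 + A e^{-t}\cos x$ of the heat equation, and a subinterval $V = (a,b)$. The amplitude `A`, the helper names and the quadrature rule are all hypothetical choices.

```python
import math

A = 0.5  # hypothetical amplitude; u stays positive because |A| < 1

def u(x, t):
    return 1.0 + A * math.exp(-t) * math.cos(x)

def u_x(x, t):
    return -A * math.exp(-t) * math.sin(x)

def integrate(g, a, b, n=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def S(t, a, b):
    # entropy within V = (a, b), definition (5)
    return integrate(lambda x: math.log(u(x, t)), a, b)

def G(t, a, b):
    # internal entropy generation, definition (7) with A = I
    return integrate(lambda x: (u_x(x, t) / u(x, t)) ** 2, a, b)

def R(t, a, b):
    # outward entropy flux, definition (8); the outward normal is +1 at b and -1 at a
    return -(u_x(b, t) / u(b, t) - u_x(a, t) / u(a, t))

def entropy_balance_residual(t, a, b, dt=1e-5):
    """dS/dt - (F + G - R) with F = 0; should vanish up to discretisation error."""
    dS = (S(t + dt, a, b) - S(t - dt, a, b)) / (2.0 * dt)
    return dS - (G(t, a, b) - R(t, a, b))
```

Calling `entropy_balance_residual(0.3, 0.5, 2.5)` should return a number close to zero, which is the content of (9) for this particular solution.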



2. Evolution of entropy 

In this section we suppose that

(10)    $f \ge 0$ in $U \times (0,\infty)$

and also

(11)    $\frac{\partial u}{\partial \nu_A} = 0$ on $\partial U \times [0,\infty)$.

The boundary condition (11) means there is no heat flux across $\partial U$: the boundary is insulated.

a. Entropy increase

Define

$S(t) = \int_U \log u(\cdot,t)\,dx$    $(t \ge 0)$.

Theorem 1 Assume $u > 0$ solves (1) and conditions (10), (11) hold. Then

(12)    $\frac{dS}{dt} \ge 0$ on $[0,\infty)$.

The proof is trivial: take $V = U$ in the entropy production equation (9) and note that
(11) implies $R(t,U) = 0$.
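As a concrete illustration of (12), consider the insulated interval $U = (0,\pi)$ with $A \equiv I$, $f \equiv 0$ and the explicit solution $u(x,t) = 1 + A e^{-t}\cos x$, which satisfies $u_x = 0$ at both end points. The sketch below (amplitude and function names are assumptions made for this example) evaluates $S(t)$ by quadrature and compares it with the classical closed form $\int_0^\pi \log(1 + a\cos x)\,dx = \pi \log\big((1+\sqrt{1-a^2})/2\big)$ for $|a| < 1$.

```python
import math

A = 0.9  # hypothetical amplitude; u > 0 because |A| < 1

def entropy(t, n=4000):
    """S(t) = integral over (0, pi) of log u(x, t), by the midpoint rule."""
    a = A * math.exp(-t)
    h = math.pi / n
    return h * sum(math.log(1.0 + a * math.cos((i + 0.5) * h)) for i in range(n))

def entropy_closed_form(t):
    """Closed form of the same integral, valid for |a| < 1."""
    a = A * math.exp(-t)
    return math.pi * math.log((1.0 + math.sqrt(1.0 - a * a)) / 2.0)

times = [0.1 * k for k in range(31)]
values = [entropy(t) for t in times]
```

The list `values` should be nondecreasing and approach $0$ from below, in line with Theorem 1: the temperature equilibrates to its mean value $1$, whose entropy density is $\log 1 = 0$.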

Remarks. (i) Estimate (12) is of course consistent with the physical principle that entropy
cannot decrease in any isolated system. But in fact the same proof shows that

$t \mapsto \int_U \Phi(u(\cdot,t))\,dx$

is nonincreasing, if $\Phi : (0,\infty) \to \mathbb{R}$ is any smooth function with $\Phi' \le 0$, $\Phi'' \ge 0$. Indeed

(13)    $\frac{d}{dt}\int_U \Phi(u)\,dx = \int_U \Phi'(u)u_t\,dx = \int_U \Phi'(u)\Big(\sum_{i,j=1}^n (a^{ij}u_{x_i})_{x_j} + f\Big)dx = -\int_U \Phi''(u)\sum_{i,j=1}^n a^{ij}u_{x_i}u_{x_j}\,dx + \int_U \Phi'(u)f\,dx \le 0$,

the boundary term vanishing because of (11).

If $f \equiv 0$, the same conclusion holds if only $\Phi'' \ge 0$, i.e. $\Phi$ is convex.

So the entropy growth inequality (12) is just the special case $\Phi(z) = -\log z$ of a general
convexity argument. Is there anything particularly special about the physical case $\Phi(z) = -\log z$?

(ii) There is a partial answer, which makes sense physically if we change our interpretation.
For simplicity now take $a^{ij} = \delta_{ij}$ and regard the PDE

(14)    $u_t - \Delta u = 0$ in $U \times (0,\infty)$

as a diffusion equation. So now $u = u(x,t) > 0$ represents the density of some physical
quantity (e.g. a chemical concentration) diffusing within $U$ as time evolves. If $V \subset\subset U$, we
hypothesize that

(15)    $\frac{d}{dt}\Big(\int_V u(\cdot,t)\,dx\Big) = \int_{\partial V}\frac{\partial u}{\partial\nu}\,dS$

which says that the rate of change of the total quantity within $V$ equals the outward flux
through $\partial V$ normal to the surface. The identity (15) holding for all $t \ge 0$ and all $V \subset\subset U$,
implies $u$ solves the diffusion equation (14).

Next think of the physical quantity (whose density is $u$) as being passively transported
by a flow with velocity field $\mathbf{v} = \mathbf{v}(x,t)$. As in §III.A, we have

$0 = \frac{d}{dt}\int_{V(t)} u\,dx = \int_{V(t)} u_t + \mathrm{div}(u\mathbf{v})\,dx$,

$V(t)$ denoting a region moving with the flow. Then

(16)    $u_t + \mathrm{div}(u\mathbf{v}) = 0$.

Equations (14), (16) are compatible if

(17)    $\mathbf{v} = -\frac{Du}{u} = -Ds$,

$s = \log u$ denoting the entropy density. So we can perhaps imagine the density as moving
microscopically with velocity equaling the negative of the gradient of the entropy density.
Locally such a motion should tend to decrease $s$, but globally $S(t) = \int_U s(\cdot,t)\,dx$ increases.



b. Second derivatives in time 

Since $\frac{dS}{dt} \ge 0$ and $S$ is bounded, it seems reasonable to imagine the graph of $t \mapsto S(t)$
this way.

[Figure: graph of $S$ against $t$.]

We accordingly might conjecture that $t \mapsto S(t)$ is concave and so $\frac{d^2S}{dt^2} \le 0$ on $[0,\infty)$. This is
true in dimension $n = 1$, but false in general: Day in [D, Chapter 5] constructs an example
where $n = 2$, $U$ is a square and $\frac{d^2S}{dt^2}(0) > 0$.

We turn our attention therefore to a related physical quantity for which a concavity in
time assertion is true. Let us write

(18)    $H(t) = \int_U u(\cdot,t)\log u(\cdot,t)\,dx$    $(t \ge 0)$.

Remark. Owing to (3) we should be inclined to think of $\int_U u(\cdot,t) - u(\cdot,t)\log u(\cdot,t)\,dx$ as
representing the free energy at time $t$. This is not so good however, as $\Phi(z) = z\log z - z$ is
convex and so if $f \equiv 0$, $t \mapsto \int_U u(\cdot,t) - u(\cdot,t)\log u(\cdot,t)\,dx$ is nondecreasing. This conclusion
is physically wrong: the free (a.k.a. available) energy should be diminishing during an
irreversible process.

It is worth considering expressions of the form (18) however, as they often appear in the
PDE literature under the name "entropy": see e.g. Chow [CB], Hamilton [H], and see also
§V.A following. □

Theorem 2 Assume $u > 0$ solves the heat equation (14), with

$\frac{\partial u}{\partial\nu} = 0$ on $\partial U \times [0,\infty)$.

(i) We have

(19)    $\frac{dH}{dt} \le 0$ on $[0,\infty)$.

(ii) If $U$ is convex, then

(20)    $\frac{d^2H}{dt^2} \ge 0$ on $[0,\infty)$.

We will need this

Lemma. Let $U \subset \mathbb{R}^n$ be smooth, bounded, convex. Suppose $u \in C^2(\bar U)$ satisfies

(21)    $\frac{\partial u}{\partial\nu} = 0$ on $\partial U$.

Then

(22)    $\frac{\partial |Du|^2}{\partial\nu} \le 0$ on $\partial U$.

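Proof. 1. Fix any point $x^0$ on $\partial U$. We may without loss assume that $x^0 = 0$ and that near
0, $\partial U$ is the graph in the $e_n$-direction of a convex function $\gamma : \mathbb{R}^{n-1} \to \mathbb{R}$, with

(23)    $D\gamma(0) = 0$.

Thus a typical point $x \in \partial U$ near 0 is written as

$(x', \gamma(x'))$ for $x' \in \mathbb{R}^{n-1}$, $x'$ near 0.

Let $h$ denote any smooth function which vanishes on $\partial U$. Then

$h(x', \gamma(x')) \equiv 0$ near 0.

Thus if $i \in \{1,\dots,n-1\}$,

$h_{x_i} + h_{x_n}\gamma_{x_i} = 0$ near 0;

and consequently

(24)    $h_{x_i}(0) = 0$    $(i = 1,\dots,n-1)$.

2. Set $h = \frac{\partial u}{\partial\nu} = \sum_{j=1}^n u_{x_j}\nu^j$, where $\nu = (\nu^1,\dots,\nu^n)$. Then (21), (24) imply

(25)    $\sum_{j=1}^n \big(u_{x_ix_j}\nu^j + u_{x_j}\nu^j_{x_i}\big) = 0$    $(i = 1,\dots,n-1)$

at 0. Now

$\frac{\partial |Du|^2}{\partial\nu} = 2\sum_{i,j=1}^n \nu^j u_{x_i}u_{x_ix_j}$,

and consequently (25) says

(26)    $\frac{\partial |Du|^2}{\partial\nu} = -2\sum_{i,j=1}^n u_{x_i}u_{x_j}\nu^j_{x_i}$ at 0.

3. But since $\partial U$ is the graph of $\gamma$ near 0, we have

$\nu^j = \frac{\gamma_{x_j}}{(1+|D\gamma|^2)^{1/2}}$    $(j = 1,\dots,n-1)$,    $\nu^n = \frac{-1}{(1+|D\gamma|^2)^{1/2}}$.

So for $1 \le i,j \le n-1$:

$\nu^j_{x_i} = \frac{\gamma_{x_ix_j}}{(1+|D\gamma|^2)^{1/2}} - \sum_{k=1}^{n-1}\frac{\gamma_{x_j}\gamma_{x_k}\gamma_{x_kx_i}}{(1+|D\gamma|^2)^{3/2}}$.

As (21) implies $u_{x_n}(0) = 0$, we conclude from (23), (26) that

$\frac{\partial |Du|^2}{\partial\nu} = -2\sum_{i,j=1}^{n-1} u_{x_i}u_{x_j}\gamma_{x_ix_j} \le 0$ at 0,

since $\gamma$ is a convex function.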

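Proof of Theorem 2. 1. In light of (18),

$\frac{d}{dt}H(t) = \int_U u_t\log u + u_t\,dx = \int_U \Delta u\log u + \Delta u\,dx = -\int_U \frac{|Du|^2}{u}\,dx \le 0$,

where we used the no-flux boundary condition $\frac{\partial u}{\partial\nu} = 0$ on $\partial U$.

2. Suppose now $U$ is convex. Then

$\frac{d^2}{dt^2}H(t) = \int_U -2\frac{Du\cdot Du_t}{u} + \frac{|Du|^2}{u^2}u_t\,dx = \int_U -2\sum_{i=1}^n \frac{u_{x_i}\Delta u_{x_i}}{u} + \frac{|Du|^2}{u^2}\Delta u\,dx$,

since $u_t = \Delta u$. Integrate by parts:

$\frac{d^2}{dt^2}H(t) = \int_U 2\sum_{i,j=1}^n \frac{u_{x_ix_j}u_{x_ix_j}}{u} - 4\sum_{i,j=1}^n \frac{u_{x_i}u_{x_j}u_{x_ix_j}}{u^2} + 2\frac{|Du|^4}{u^3}\,dx - 2\sum_{i,j=1}^n \int_{\partial U}\frac{u_{x_i}u_{x_ix_j}\nu^j}{u}\,dS$.

The boundary term is

$-\int_{\partial U}\frac{1}{u}\frac{\partial |Du|^2}{\partial\nu}\,dS \ge 0$,

according to the Lemma. Consequently:

$\frac{d^2}{dt^2}H(t) \ge \int_U 2\frac{|D^2u|^2}{u} - 4\sum_{i,j=1}^n \frac{u_{x_i}u_{x_j}u_{x_ix_j}}{u^2} + 2\frac{|Du|^4}{u^3}\,dx \ge 2\int_U \frac{|D^2u|^2}{u} - 2\frac{|Du|^2|D^2u|}{u^2} + \frac{|Du|^4}{u^3}\,dx \ge 0$,

since

$\frac{|Du|^2|D^2u|}{u^2} = \Big(\frac{|Du|^2}{u^{3/2}}\Big)\Big(\frac{|D^2u|}{u^{1/2}}\Big) \le \frac12\frac{|Du|^4}{u^3} + \frac12\frac{|D^2u|^2}{u}$.

□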
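A quick numerical sanity check of Theorem 2 is possible on the interval $U = (0,\pi)$, which is convex. The sketch below assumes the explicit Neumann solution $u(x,t) = 1 + A e^{-t}\cos x$ of the heat equation (the amplitude, time step and names are choices made for this illustration) and inspects first and second differences of $H(t) = \int_U u\log u\,dx$.

```python
import math

A = 0.9  # hypothetical amplitude; u > 0 because |A| < 1

def H(t, n=2000):
    """H(t) = integral over (0, pi) of u log u, by the midpoint rule."""
    a = A * math.exp(-t)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = 1.0 + a * math.cos((i + 0.5) * h)
        total += u * math.log(u)
    return h * total

dt = 0.1
values = [H(k * dt) for k in range(31)]
# first differences approximate dt * dH/dt, second differences dt^2 * d^2H/dt^2
first = [values[k + 1] - values[k] for k in range(30)]
second = [values[k + 2] - 2.0 * values[k + 1] + values[k] for k in range(29)]
```

For this solution every entry of `first` should be nonpositive and every entry of `second` nonnegative, matching (19) and (20).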



c. A differential form of Harnack's inequality 

Again we consider positive solutions $u$ of the heat equation

(27)    $u_t - \Delta u = 0$ in $U \times (0,\infty)$

with

(28)    $\frac{\partial u}{\partial\nu} = 0$ on $\partial U \times [0,\infty)$.

We assume $U$ is convex.

Lemma We have

(29)    $\frac{u_t}{u} + \frac{n}{2t} \ge \frac{|Du|^2}{u^2}$.

This estimate, derived in a different context by Li-Yau, is a time-dependent version of the
pointwise estimate on $\gamma$ from §A.2. Note that we can rewrite (29) to obtain the pointwise
estimate

$s_t + \frac{n}{2t} \ge \gamma$

where as usual $s = \log u$ is the entropy density and $\gamma = \frac{|Du|^2}{u^2}$ is the local rate of entropy
production.

Proof. 1. Write $v = \log u$; so that (27) becomes

(30)    $v_t - \Delta v = |Dv|^2$.
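Set

(31)    $w = \Delta v$.

Then (30) implies

(32)    $w_t - \Delta w = \Delta(|Dv|^2) = 2|D^2v|^2 + 2Dv\cdot Dw$.

But

$w^2 = (\Delta v)^2 \le n|D^2v|^2$,

and so (32) implies

(33)    $w_t - \Delta w - 2Dv\cdot Dw \ge \frac{2}{n}w^2$.

2. We will exploit the good term on the right hand side of (33). Set

(34)    $\tilde w = tw + \frac{n}{2}$.

Then

(35)    $\tilde w_t - \Delta\tilde w - 2Dv\cdot D\tilde w = w + t(w_t - \Delta w - 2Dv\cdot Dw) \ge w + \frac{2t}{n}w^2$.

Now (34) says

$w = \frac{\tilde w}{t} - \frac{n}{2t}$

and so

$w^2 = \frac{\tilde w^2}{t^2} - \frac{n\tilde w}{t^2} + \frac{n^2}{4t^2}$.

Thus (35) implies

(36)    $\tilde w_t - \Delta\tilde w - 2Dv\cdot D\tilde w \ge -\frac{1}{t}\tilde w$.

3. Now

$\frac{\partial\tilde w}{\partial\nu} = t\frac{\partial w}{\partial\nu} = t\frac{\partial}{\partial\nu}(v_t - |Dv|^2)$ on $\partial U \times [0,\infty)$

owing to (30), (31). Since $\frac{\partial v}{\partial\nu} = \frac{1}{u}\frac{\partial u}{\partial\nu} = 0$ on $\partial U \times [0,\infty)$,

$\frac{\partial v_t}{\partial\nu} = 0$ on $\partial U \times [0,\infty)$.

Also, the Lemma in §b gives

$\frac{\partial}{\partial\nu}|Dv|^2 \le 0$ on $\partial U \times [0,\infty)$.

Thus

(37)    $\frac{\partial\tilde w}{\partial\nu} \ge 0$ on $\partial U \times [0,\infty)$.

4. Fix $\varepsilon > 0$ so small that

(38)    $\tilde w = tw + \frac{n}{2} > 0$ on $U \times \{t = \varepsilon\}$.

Then (36)-(38) and the maximum principle for parabolic PDE imply

$\tilde w \ge 0$ in $U \times [\varepsilon,\infty)$.

This is true for all small $\varepsilon > 0$, and so

$tw + \frac{n}{2} \ge 0$ in $U \times (0,\infty)$.

But $w = \Delta v = v_t - |Dv|^2 = \frac{u_t}{u} - \frac{|Du|^2}{u^2}$. Estimate (29) follows. □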

The following form of Harnack's inequality is a

Corollary. For $x_1, x_2 \in U$, $0 < t_1 < t_2$, we have

(39)    $u(x_1,t_1) \le \Big(\frac{t_2}{t_1}\Big)^{n/2} e^{\frac{|x_2-x_1|^2}{4(t_2-t_1)}}\,u(x_2,t_2)$.

Proof. As before, take $v = \log u$. Then

(40)    $v(x_2,t_2) - v(x_1,t_1) = \int_0^1 \frac{d}{ds}v(sx_2 + (1-s)x_1, st_2 + (1-s)t_1)\,ds$
        $= \int_0^1 Dv\cdot(x_2-x_1) + v_t(t_2-t_1)\,ds$
        $\ge \int_0^1 -|Dv|\,|x_2-x_1| + \Big(|Dv|^2 - \frac{n}{2(st_2+(1-s)t_1)}\Big)(t_2-t_1)\,ds$
        $\ge -\frac{n}{2}\log\Big(\frac{t_2}{t_1}\Big) - \frac{|x_2-x_1|^2}{4(t_2-t_1)}$.

Here we used the inequality $ab \le \varepsilon a^2 + \frac{1}{4\varepsilon}b^2$ and the identity

$\int_0^1 \frac{t_2-t_1}{t_1+s(t_2-t_1)}\,ds = \log\Big(\frac{t_2}{t_1}\Big)$.

Exponentiate both sides of (40) to obtain (39). □
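Both the differential estimate (29) and the Harnack inequality (39) can be spot-checked on an explicit solution. The sketch below assumes $n = 1$, $U = (0,\pi)$ (convex) and the Neumann solution $u(x,t) = 1 + A e^{-t}\cos x$; the amplitude, the sampling grids and the function names are hypothetical choices for this illustration, and sampling is of course not a proof.

```python
import math

A = 0.9  # hypothetical amplitude; u > 0 because |A| < 1
N = 1    # space dimension

def u(x, t):
    return 1.0 + A * math.exp(-t) * math.cos(x)

def u_t(x, t):
    return -A * math.exp(-t) * math.cos(x)

def u_x(x, t):
    return -A * math.exp(-t) * math.sin(x)

def li_yau_gap(x, t):
    """Left side minus right side of (29); expected to be >= 0."""
    return u_t(x, t) / u(x, t) + N / (2.0 * t) - (u_x(x, t) / u(x, t)) ** 2

def harnack_ratio(x1, t1, x2, t2):
    """u(x1, t1) divided by the right side of (39); expected to be <= 1."""
    factor = (t2 / t1) ** (N / 2.0) * math.exp((x2 - x1) ** 2 / (4.0 * (t2 - t1)))
    return u(x1, t1) / (factor * u(x2, t2))

xs = [math.pi * i / 40 for i in range(41)]
ts = [0.05 * k for k in range(1, 81)]
min_gap = min(li_yau_gap(x, t) for x in xs for t in ts)
max_ratio = max(harnack_ratio(x1, t1, x2, t2)
                for x1 in xs[::4] for x2 in xs[::4]
                for t1 in ts[::8] for t2 in ts[::8] if t2 > t1)
```

On these samples `min_gap` should be nonnegative and `max_ratio` should not exceed 1. For comparison, a direct computation shows that the fundamental solution $(4\pi t)^{-n/2}e^{-|x|^2/4t}$ on all of $\mathbb{R}^n$ gives equality in (29).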
3. Clausius inequality 

Day [D, Chapters 3,4] has noted that the heat equation (and related second-order parabolic
PDE) admit various estimates which are reminiscent of the classical Clausius inequality from
Chapter II. We recount in this section some simple cases of his calculations.

We hereafter assume $u > 0$ is a smooth solution of

(1)    $u_t - \sum_{i,j=1}^n (a^{ij}u_{x_i})_{x_j} = 0$ in $U \times (0,\infty)$

subject now to the prescribed temperature condition

(2)    $u(\cdot,t) = \tau(t)$ on $\partial U$,

where $\tau : [0,\infty) \to (0,\infty)$ is a given, smooth function.

a. Cycles

Let us assume that $T > 0$ and $\tau$ is $T$-periodic:

(3)    $\tau(t+T) = \tau(t)$ for all $t \ge 0$.

We call a $T$-periodic solution of (1), (2) a cycle.

Lemma 1 Corresponding to each $T$-periodic $\tau$ as above, there exists a unique cycle $u$.

Proof. 1. Given a smooth function $g : \bar U \to (0,\infty)$, with $g = \tau(0)$ on $\partial U$, we denote by $u$
the unique smooth solution of

(4)    $u_t - \sum_{i,j=1}^n (a^{ij}u_{x_i})_{x_j} = 0$ in $U_T$,
       $u = \tau$ on $\partial U \times [0,T]$,
       $u = g$ on $U \times \{t = 0\}$.

2. Let $\tilde g$ be another smooth function and define $\tilde u$ similarly. Then

$\frac{d}{dt}\Big(\int_U (u-\tilde u)^2\,dx\Big) = 2\int_U (u_t - \tilde u_t)(u - \tilde u)\,dx$
        $= -2\int_U \sum_{i,j=1}^n a^{ij}(u_{x_i} - \tilde u_{x_i})(u_{x_j} - \tilde u_{x_j})\,dx$
        $\le -2\theta\int_U |D(u-\tilde u)|^2\,dx$,

there being no boundary term when we integrate by parts, as $u - \tilde u = \tau - \tau = 0$ on $\partial U$. A
version of Poincare's inequality states

$\int_U w^2\,dx \le C\int_U |Dw|^2\,dx$

for all smooth $w : \bar U \to \mathbb{R}$, with $w = 0$ on $\partial U$. Thus

$\frac{d}{dt}\Big(\int_U (u-\tilde u)^2\,dx\Big) \le -\mu\int_U (u-\tilde u)^2\,dx$

for some $\mu > 0$ and all $0 \le t \le T$. Hence

(5)    $\int_U (u(\cdot,T) - \tilde u(\cdot,T))^2\,dx \le e^{-\mu T}\int_U (g - \tilde g)^2\,dx$.

Define $\Lambda(g) = u(\cdot,T)$, $\Lambda(\tilde g) = \tilde u(\cdot,T)$. As $e^{-\mu T/2} < 1$, $\Lambda$ extends to a strict contraction from
$L^2(U)$ into $L^2(U)$. Thus $\Lambda$ has a unique fixed point $g \in L^2(U)$. Parabolic regularity theory
implies $g$ is smooth, and the corresponding smooth solution $u$ of (4) is the cycle. □

b. Heating 

Let $u$ be the unique cycle corresponding to $\tau$ and recall from §A.1 that

$q = -ADu$ = heat flux.

Thus

(6)    $Q(t) = -\int_{\partial U} q\cdot\nu\,dS = \int_{\partial U}\frac{\partial u}{\partial\nu_A}\,dS$

represents the total heat flux into $U$ from its exterior at time $t \ge 0$. We define as well

(7)    $\tau^+ = \sup\{\tau(t) \mid 0 \le t \le T,\ Q(t) > 0\}$
       $\tau^- = \inf\{\tau(t) \mid 0 \le t \le T,\ Q(t) < 0\}$

to denote, respectively, the maximum temperature at which heat is absorbed and minimum
temperature at which heat is emitted.

Theorem 1 (i) We have

(8)    $\int_0^T \frac{Q}{\tau}\,dt \le 0$,

with strict inequality unless $\tau$ is constant.

(ii) Furthermore if $\tau$ is not constant,

(9)    $\tau^- < \tau^+$.

Notice that (8) is an obvious analogue of Clausius' inequality and (9) is a variant of
classical estimates concerning the efficiency of cycles. In particular (9) implies that it is not
possible for heat to be absorbed only at temperatures below some given temperature $\tau_0$ and
to be emitted only at temperatures above $\tau_0$. This is of course a form of the Second Law.

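Proof. 1. Write $v = \log u$, so that

$v_t - \sum_{i,j=1}^n (a^{ij}v_{x_i})_{x_j} = \sum_{i,j=1}^n a^{ij}v_{x_i}v_{x_j} = \gamma$.

Then

$\frac{d}{dt}\Big(\int_U v(\cdot,t)\,dx\Big) = \int_{\partial U}\frac{\partial v}{\partial\nu_A}\,dS + \int_U \gamma\,dx = \frac{Q(t)}{\tau(t)} + \int_U \gamma\,dx \ge \frac{Q(t)}{\tau(t)}$,

since $u(\cdot,t) = \tau(t)$ on $\partial U$. As $t \mapsto v(\cdot,t)$ is $T$-periodic, we deduce (8) upon integrating the
above inequality for $0 \le t \le T$. We obtain as well a strict inequality in (8) unless

$\int_0^T\int_U \gamma\,dx\,dt = 0$,

which identity implies

$\int_0^T\int_U |Dv|^2\,dx\,dt = 0$.

Thus $x \mapsto v(x,t)$ is constant for each $0 \le t \le T$ and so

$u(x,t) \equiv \tau(t)$    $(x \in U)$

for each $0 \le t \le T$. But then the PDE (1) implies $u_t \equiv 0$ in $U_T$ and so $t \mapsto \tau(t)$ is constant.

2. We now adapt a calculation from §II.A.3. If $\tau$ is not constant, then

$0 > \int_0^T \frac{Q(t)}{\tau(t)}\,dt = \int_0^T \frac{Q^+(t) - Q^-(t)}{\tau(t)}\,dt \ge \frac{1}{\tau^+}\int_0^T Q^+(t)\,dt - \frac{1}{\tau^-}\int_0^T Q^-(t)\,dt$

(10)    $= \frac{1}{\tau^+}\int_0^T \frac{|Q(t)| + Q(t)}{2}\,dt - \frac{1}{\tau^-}\int_0^T \frac{|Q(t)| - Q(t)}{2}\,dt$.

But the PDE (1) implies

$\int_0^T Q(t)\,dt = \int_0^T \Big(\frac{d}{dt}\int_U u(\cdot,t)\,dx\Big)dt = 0$,

since $u$ is $T$-periodic. Hence (10) forces

$0 > \frac12\Big(\frac{1}{\tau^+} - \frac{1}{\tau^-}\Big)\int_0^T |Q(t)|\,dt$,

and so $\tau^- < \tau^+$. □

See Day [D, p. 64] for an interesting estimate from below on $\tau^+ - \tau^-$.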
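The Clausius-type inequality (8) and the estimate (9) can be observed in a small simulation. The sketch below is only an illustration under assumptions not made in the text: one space dimension, $U = (0,1)$, $A \equiv I$, period $T = 1$, boundary temperature $\tau(t) = 2 + \sin 2\pi t$, and an explicit finite-difference scheme run over several periods so that the numerical solution is close to the cycle. All names and parameter values are hypothetical.

```python
import math

def tau(t):
    """Hypothetical 1-periodic boundary temperature, always positive."""
    return 2.0 + math.sin(2.0 * math.pi * t)

def simulate_cycle(n_cells=20, steps_per_period=1000, periods=6):
    """Explicit scheme for u_t = u_xx on (0, 1) with u = tau(t) at both ends.

    Returns the discrete Clausius integral of Q/tau over the last period,
    together with discrete versions of tau^- and tau^+ from (7).
    """
    dx = 1.0 / n_cells
    dt = 1.0 / steps_per_period
    lam = dt / dx ** 2
    assert lam <= 0.5  # stability and positivity of the explicit scheme
    u = [tau(0.0)] * (n_cells + 1)
    clausius = 0.0
    tau_plus, tau_minus = -math.inf, math.inf
    for k in range(periods * steps_per_period):
        t = k * dt
        u[0] = u[-1] = tau(t)
        # heat flux into U through both end points (one-sided differences)
        Q = ((u[0] - u[1]) + (u[-1] - u[-2])) / dx
        if k >= (periods - 1) * steps_per_period:  # accumulate over the last period only
            clausius += Q / tau(t) * dt
            if Q > 0:
                tau_plus = max(tau_plus, tau(t))
            if Q < 0:
                tau_minus = min(tau_minus, tau(t))
        interior = [u[i] + lam * (u[i - 1] - 2.0 * u[i] + u[i + 1])
                    for i in range(1, n_cells)]
        u = [u[0]] + interior + [u[-1]]
    return clausius, tau_minus, tau_plus

clausius, tau_minus, tau_plus = simulate_cycle()
```

With these settings `clausius` should come out strictly negative and `tau_minus` should lie below `tau_plus`, as (8) and (9) predict for a non-constant $\tau$.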

c. Almost reversible cycles 

Recall from Chapter II that the Clausius inequality becomes an equality for any reversible
cycle. Furthermore the model discussed in §II.A.4 suggests that real, irreversible cycles
approximate ideal, reversible cycles if the former are traversed very slowly. Following Day
[D, Chapter 3], we establish next an analogue for our PDE (1).

Definition. We say a family of functions $\{\tau_\varepsilon\}_{0<\varepsilon\le 1}$ is slowly varying if there exist times
$\{T_\varepsilon\}_{0<\varepsilon\le 1}$ and constants $0 < c \le C$ so that

(11)    (a) $\tau_\varepsilon : [0,\infty) \to (0,\infty)$ is $T_\varepsilon$-periodic
        (b) $\tau_\varepsilon \ge c$
        (c) $T_\varepsilon \le C/\varepsilon$
        (d) $|\dot\tau_\varepsilon| \le C\varepsilon$, $|\ddot\tau_\varepsilon| \le C\varepsilon^2$

for all $\varepsilon > 0$, $t \ge 0$.

For any $\tau_\varepsilon$ as above, let $u_\varepsilon$ be the corresponding cycle, and set

$Q_\varepsilon(t) = \int_{\partial U}\frac{\partial u_\varepsilon}{\partial\nu_A}(\cdot,t)\,dS$.

Theorem 2 We have

(12)    $\int_0^{T_\varepsilon}\frac{Q_\varepsilon}{\tau_\varepsilon}\,dt = O(\varepsilon)$ as $\varepsilon \to 0$.

Estimate (12) is a sort of approximation to the Clausius equality for reversible cycles.
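Proof. 1. Let $w = w(x)$ be the unique smooth solution of

(13)    $-\sum_{i,j=1}^n (a^{ij}w_{x_i})_{x_j} = 1$ in $U$,  $w = 0$ on $\partial U$,

and set

(14)    $\tilde u(x,t) := u_\varepsilon(x,t) - \tau_\varepsilon(t) + w(x)\dot\tau_\varepsilon(t)$

for $x \in U$, $0 \le t \le T_\varepsilon$. Then (1), (13) imply

(15)    $\tilde u_t - \sum_{i,j=1}^n (a^{ij}\tilde u_{x_i})_{x_j} = w(x)\ddot\tau_\varepsilon(t)$ in $U_{T_\varepsilon}$,  $\tilde u = 0$ on $\partial U \times [0,T_\varepsilon]$.

Now

(16)    $Q_\varepsilon(t) = \int_{\partial U}\frac{\partial u_\varepsilon}{\partial\nu_A}\,dS = \frac{d}{dt}\int_U u_\varepsilon(\cdot,t)\,dx = |U|\dot\tau_\varepsilon(t) - \Big(\int_U w\,dx\Big)\ddot\tau_\varepsilon(t) + R(t)$,

for

(17)    $R(t) = \int_U \tilde u_t(\cdot,t)\,dx$.

2. We now assert that

(18)    $\int_0^{T_\varepsilon} R^2(t)\,dt = O(\varepsilon^3)$ as $\varepsilon \to 0$.

To verify this claim, multiply the PDE (15) by $\tilde u_t$ and integrate over $U_{T_\varepsilon}$:

(19)    $\int_0^{T_\varepsilon}\int_U \tilde u_t^2\,dx\,dt + \int_0^{T_\varepsilon}\int_U \sum_{i,j=1}^n a^{ij}\tilde u_{x_i}\tilde u_{x_jt}\,dx\,dt = \int_0^{T_\varepsilon}\int_U w\ddot\tau_\varepsilon\tilde u_t\,dx\,dt$.

The second term on the left is

$\int_0^{T_\varepsilon}\frac{d}{dt}\Big(\frac12\int_U \sum_{i,j=1}^n a^{ij}\tilde u_{x_i}\tilde u_{x_j}\,dx\Big)dt = 0$,

owing to the periodicity of $u_\varepsilon$, $\tau_\varepsilon$ and thus $\tilde u$. The expression on the right hand side of (19)
is estimated by

$\frac12\int_0^{T_\varepsilon}\int_U \tilde u_t^2\,dx\,dt + C\int_0^{T_\varepsilon}|\ddot\tau_\varepsilon|^2\,dt \le \frac12\int_0^{T_\varepsilon}\int_U \tilde u_t^2\,dx\,dt + O(\varepsilon^3)$,

since $T_\varepsilon \le C/\varepsilon$, $|\ddot\tau_\varepsilon| \le C\varepsilon^2$. So (19) implies

$\int_0^{T_\varepsilon}\int_U \tilde u_t^2\,dx\,dt = O(\varepsilon^3)$,

and thus

$\int_0^{T_\varepsilon} R^2(t)\,dt \le |U|\int_0^{T_\varepsilon}\int_U \tilde u_t^2\,dx\,dt = O(\varepsilon^3)$.

This proves (18).

3. Return now to (16). We have

$\int_0^{T_\varepsilon}\frac{Q_\varepsilon}{\tau_\varepsilon}\,dt = |U|\int_0^{T_\varepsilon}\frac{\dot\tau_\varepsilon}{\tau_\varepsilon}\,dt - \Big(\int_U w\,dx\Big)\int_0^{T_\varepsilon}\frac{\ddot\tau_\varepsilon}{\tau_\varepsilon}\,dt + \int_0^{T_\varepsilon}\frac{R}{\tau_\varepsilon}\,dt$.

The first term on the right is zero, since $\frac{\dot\tau_\varepsilon}{\tau_\varepsilon} = \frac{d}{dt}(\log\tau_\varepsilon)$ and $\tau_\varepsilon$ is $T_\varepsilon$-periodic. The second
term is estimated by

$C|T_\varepsilon|\sup|\ddot\tau_\varepsilon| = O(\varepsilon)$,

and the third term is less than or equal to

$\frac{T_\varepsilon^{1/2}}{c}\Big(\int_0^{T_\varepsilon} R^2\,dt\Big)^{1/2} \le C\varepsilon^{-1/2}\varepsilon^{3/2} = O(\varepsilon)$,

according to (18). This all establishes (12). □

Remark. Under the additional assumption that $|\dddot\tau_\varepsilon| \le C\varepsilon^3$, Day proves: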



where 

See [D, p. 53-61]. □ 
CHAPTER 5: Conservation laws and kinetic equations

The previous chapter considered linear, second-order PDE which directly model heat
conduction. This chapter by contrast investigates various nonlinear, first-order PDE, mostly
in the form of conservation laws. The main theme is the use of entropy inspired concepts to
understand irreversibility in time.

A. Some physical PDE

1. Compressible Euler equations

We recall here from §III.C the compressible Euler's equations

(1)    $\frac{D\rho}{Dt} + \rho\,\mathrm{div}\,\mathbf{v} = 0$,    $\rho\frac{D\mathbf{v}}{Dt} = -Dp$

where $\mathbf{v} = (v^1,v^2,v^3)$ is the velocity, $\rho$ is the mass density, and $p$ is the pressure.

a. Equation of state

Our derivation in Chapter III shows that $p$ can be regarded as a function of $s$ (the entropy
density/unit mass) and $v$ (the specific volume). Now if the fluid motion is isentropic, we can
take $s$ to be constant, and so regard $p$ as a function only of $v$. But $v = \rho^{-1}$ and so we may
equivalently take $p$ as a function of $\rho$:

(2)    $p = p(\rho)$.

This is an equation of state, the precise form of which depends upon the particular fluid we
are investigating.

Assume further that the fluid is a simple ideal gas. Recall now formula (9) from §I.F,
which states that at equilibrium

(3)    $PV^\gamma$ = constant,

where $\gamma = \frac{C_P}{C_V} > 1$.

Since we are assuming our flow is isentropic, it seems plausible to assume a local version
of (3) holds:

(4)    $pv^\gamma = \kappa$,

$\kappa$ denoting some positive constant. Thus for an isentropic flow of a simple ideal gas the
general equation of state (2) reads

(5)    $p = \kappa\rho^\gamma$.
b. Conservation law form

For later reference, we recast Euler's equations (1). To do so, let us note that the second
equation in (1) says

$\rho\Big(v^i_t + \sum_{j=1}^3 v^jv^i_{x_j}\Big) = -p_{x_i}$    $(i = 1,2,3)$.

Hence

$(\rho v^i)_t = \rho_tv^i + \rho v^i_t$
        $= -v^i\sum_{j=1}^3 (\rho v^j)_{x_j} - \rho\sum_{j=1}^3 v^jv^i_{x_j} - p_{x_i}$
        $= -\sum_{j=1}^3 (\rho v^iv^j)_{x_j} - p_{x_i}$    $(i = 1,2,3)$.

Therefore we can rewrite Euler's equations (1) to read

(6)    $\rho_t + \mathrm{div}(\rho\mathbf{v}) = 0$,
       $(\rho\mathbf{v})_t + \mathrm{div}(\rho\mathbf{v}\otimes\mathbf{v} + pI) = 0$,

where $\mathbf{v}\otimes\mathbf{v} = ((v^iv^j))$ and $p = p(\rho)$. This is the conservation law form of the equations,
expressing conservation of mass and linear momentum.

2. Boltzmann's equation 

Euler's equations (1), (6) are PDE describing the macroscopic behavior of a fluid in
terms of $\mathbf{v}$ and $\rho$. On the other hand the microscopic behavior is presumably dictated by the
dynamics of a huge number $N$ ($\approx 6.02 \times 10^{23}$) of interacting particles. The key point is
this: of the really large number of coordinates needed to characterize the details of particle
motion, only a very few parameters persist at the macroscopic level, after "averaging over the
microscopic dynamics". Understanding this transition from small to large scale dynamics is
a fundamental problem in mathematical physics.

One important idea is to study as well the mesoscopic behavior, which we can think of as
describing the fluid behavior at middle-sized scales. The relevant PDE here are generically
called kinetic equations, the most famous of which is Boltzmann's equation.

a. A model for dilute gases

The unknown in Boltzmann's equation is a function

$f : \mathbb{R}^3 \times \mathbb{R}^3 \times [0,\infty) \to [0,\infty)$,

such that

$f(x,v,t)$ is the density of the number of particles
at time $t \ge 0$ and position $x \in \mathbb{R}^3$, with
velocity $v \in \mathbb{R}^3$.
Assume first that the particles do not interact. Then for each velocity $v \in \mathbb{R}^3$

$\frac{d}{dt}\Big(\int_{V(t)} f(x,v,t)\,dx\Big) = 0$,

where $V(t) = V + tv$ is the region occupied at time $t$ by those particles initially lying within
$V$ and possessing velocity $v$. As in §III.A, we deduce

$0 = \int_{V(t)} f_t + \mathrm{div}_x(fv)\,dx = \int_{V(t)} f_t + v\cdot D_xf\,dx$.

Consequently

(7)    $f_t + v\cdot D_xf = 0$ in $\mathbb{R}^3 \times \mathbb{R}^3 \times (0,\infty)$

if the particles do not interact.

Suppose now interactions do occur in the form of collisions, which we model as follows.
Assume two particles, with velocities $v$ and $v_*$, collide and after the collision have velocities
$v'$ and $v'_*$.

[Figure: two particles with velocities $v$, $v_*$ before the collision and $v'$, $v'_*$ after it.]

We assume the particles all have the same mass $m$, so that conservation of momentum and
kinetic energy dictate:

(8)    (a) $v + v_* = v' + v'_*$
       (b) $|v|^2 + |v_*|^2 = |v'|^2 + |v'_*|^2$,

where $v, v_*, v', v'_* \in \mathbb{R}^3$.

We now change variables, by writing

$v' - v = -\alpha w$

for $w \in S^2$ (= unit sphere in $\mathbb{R}^3$) and $\alpha = |v' - v| \ge 0$. Then (8)(a) implies

$v'_* - v_* = \alpha w$.

Furthermore

$|v'|^2 + |v'_*|^2 = |v - \alpha w|^2 + |v_* + \alpha w|^2 = |v|^2 - 2\alpha v\cdot w + \alpha^2 + |v_*|^2 + 2\alpha v_*\cdot w + \alpha^2$.

Consequently (8)(b) yields the identity:

$\alpha = (v - v_*)\cdot w$.

Hence

(9)    $v' = v - [(v - v_*)\cdot w]w$
       $v'_* = v_* + [(v - v_*)\cdot w]w$.
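The collision rule (9) is easy to test directly. The following sketch (function names and the random sampling are choices made for this illustration) applies (9) to given velocities and a unit vector $w$, so that one can confirm the conservation laws (8)(a), (8)(b) and observe that applying the same rule to the outgoing pair recovers the incoming pair.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def collide(v, v_star, w):
    """Post-collision velocities (v', v'_*) from formula (9); w must be a unit vector."""
    alpha = dot([a - b for a, b in zip(v, v_star)], w)
    v_prime = [a - alpha * c for a, c in zip(v, w)]
    v_star_prime = [b + alpha * c for b, c in zip(v_star, w)]
    return v_prime, v_star_prime

def random_unit_vector(rng):
    """A random point of the unit sphere S^2, obtained by normalising a Gaussian sample."""
    z = [rng.gauss(0.0, 1.0) for _ in range(3)]
    r = math.sqrt(dot(z, z))
    return [c / r for c in z]

rng = random.Random(0)
v = [rng.uniform(-5.0, 5.0) for _ in range(3)]
v_star = [rng.uniform(-5.0, 5.0) for _ in range(3)]
w = random_unit_vector(rng)
v_prime, v_star_prime = collide(v, v_star, w)
```

Here `v_prime` and `v_star_prime` have the same total momentum and kinetic energy as `v` and `v_star`, up to rounding.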



We next introduce the quadratic collision operator

(10)    $Q(f,f)(v,\cdot) = \int_{S^2}\int_{\mathbb{R}^3} [f(v',\cdot)f(v'_*,\cdot) - f(v,\cdot)f(v_*,\cdot)]\,B(v - v_*,w)\,dv_*\,dS(w)$

where $dS$ means surface measure over $S^2$, $v'$, $v'_*$ are defined in terms of $v$, $v_*$, $w$ by (9), and

$B : \mathbb{R}^3 \times S^2 \to (0,\infty)$

is given. We assume also that $B(z,w)$ in fact depends only on $|z|$, $|z\cdot w|$, this ensuring that
the model is rotationally invariant.

Boltzmann's equation is the integro/differential equation

(11)    $f_t + v\cdot D_xf = Q(f,f)$ in $\mathbb{R}^3 \times \mathbb{R}^3 \times (0,\infty)$,

which differs from (7) with the addition of the collision operator $Q(f,f)$ on the right hand
side. This term models the decrease in $f(x,v,t)$ and $f(x,v_*,t)$ and the gain in $f(x,v',t)$ and
$f(x,v'_*,t)$, owing to collisions of the type illustrated above. The term $B$ models the rate of
collisions which start with velocity pairs $v$, $v_*$ and result in velocity pairs $v'$, $v'_*$ given by (9).
See Huang [HU, Chapter 3] for the physical derivation of (11).

b. H-Theorem

We henceforth assume $f$ is a smooth solution of (11), with $f \ge 0$. We suppose as well
that $f \to 0$ as $|x|, |v| \to \infty$, fast enough to justify the following calculations. Define then
Boltzmann's H-function

(12)    $H(t) = \int_{\mathbb{R}^3}\int_{\mathbb{R}^3} f\log f\,dv\,dx$    $(t \ge 0)$,

concerning the form of which we will comment later.

Theorem 1 We have

(13)    $\frac{dH}{dt} \le 0$ on $[0,\infty)$.

Proof. 1. Let us as shorthand notation hereafter write

$f' = f(v',\cdot)$,  $f_* = f(v_*,\cdot)$,  $f'_* = f(v'_*,\cdot)$.

Thus

$Q(f,f) = \int_{S^2}\int_{\mathbb{R}^3} [f'f'_* - ff_*]\,B(v - v_*,w)\,dv_*\,dS$.

2. We now claim that if $\psi : \mathbb{R}^3 \to \mathbb{R}$ is smooth, $\psi = \psi(v)$, then

(14)    $\int_{\mathbb{R}^3}\psi(v)Q(f,f)(v)\,dv = \frac14\int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3} (f'f'_* - ff_*)(\psi + \psi_* - \psi' - \psi'_*)\,B\,dv\,dv_*\,dS$.

This identity is valid since

1. interchanging $v$ with $v_*$ does not change the integrand on the left hand side of (14),

2. interchanging $(v,v_*)$ with $(v',v'_*)$ changes the sign of the integrand, and

3. interchanging $(v,v_*)$ with $(v'_*,v')$ changes the sign as well.

More precisely, write $B_1$ to denote the left hand side of (14). Then

$B_2 := \int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\psi_*(f'f'_* - ff_*)\,B(v - v_*,w)\,dv\,dv_*\,dS = \int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\psi(f'_*f' - f_*f)\,B(v_* - v,w)\,dv_*\,dv\,dS$,

where we relabeled variables by interchanging $v$ and $v_*$. Since $B(z,w)$ depends only on $|z|$,
$|z\cdot w|$, we deduce

$B_2 = B_1$.

Next set

$B_3 := \int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\psi'(f'f'_* - ff_*)\,B(v - v_*,w)\,dv\,dv_*\,dS$.

For each fixed $w \in S^2$, we change variables in $\mathbb{R}^3 \times \mathbb{R}^3$ by mapping $(v,v_*)$ to $(v',v'_*)$, using
the formulas (9). Then

$\frac{\partial(v',v'_*)}{\partial(v,v_*)} = \begin{pmatrix} I - w\otimes w & w\otimes w \\ w\otimes w & I - w\otimes w \end{pmatrix}_{6\times 6}$

and so

$\Big|\det\frac{\partial(v',v'_*)}{\partial(v,v_*)}\Big| = 1$.

Consequently

$B_3 = \int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\psi'(f'f'_* - ff_*)\,B(v - v_*,w)\,dv'\,dv'_*\,dS$.

The integrand is

$\psi(v')(f(v',\cdot)f(v'_*,\cdot) - f(v_*,\cdot)f(v,\cdot))\,B(v - v_*,w)$

and we can now regard $v$, $v_*$ as functions of $v'$, $v'_*$:

$v = v' - [(v' - v'_*)\cdot w]w$
$v_* = v'_* + [(v' - v'_*)\cdot w]w$.

Next we simply relabel the variables above, and so write "$v, v_*$" for "$v', v'_*$", and vice-versa:

$B_3 = \int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\psi(ff_* - f'f'_*)\,B(v' - v'_*,w)\,dv\,dv_*\,dS$.

Now (9) implies

$|v - v_*| = |v' - v'_*|$,    $(v - v_*)\cdot w = -(v' - v'_*)\cdot w$;

and so, since $B(z,w)$ depends only on $|z|$, $|z\cdot w|$, we deduce

$B_3 = -B_1$.

Similarly we have

$B_4 = -B_1$,

for

$B_4 := \int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3}\psi'_*(f'f'_* - ff_*)\,B(v - v_*,w)\,dv\,dv_*\,dS$.

Combining everything, we discover

$4B_1 = B_1 + B_2 - B_3 - B_4$,

and this is the identity (14).

3. Now set $\psi(v) = \log f(v,\cdot)$ in (14). Then

(15)    $\int_{\mathbb{R}^3}\log f(v,\cdot)\,Q(f,f)(v,\cdot)\,dv = \frac14\int_{S^2}\int_{\mathbb{R}^3}\int_{\mathbb{R}^3} (f'f'_* - ff_*)[\log(ff_*) - \log(f'f'_*)]\,B\,dv\,dv_*\,dS \le 0$,

since $B \ge 0$ and log is increasing. Also put $\psi \equiv 1$, to conclude

(16)    $\int_{\mathbb{R}^3} Q(f,f)(v,\cdot)\,dv = 0$.

4. Thus

$\frac{d}{dt}H(t) = \int_{\mathbb{R}^3}\int_{\mathbb{R}^3} f_t(\log f + 1)\,dv\,dx$
        $= \int_{\mathbb{R}^3}\int_{\mathbb{R}^3} [-v\cdot D_xf + Q(f,f)](\log f + 1)\,dv\,dx$
        $\le -\int_{\mathbb{R}^3}\int_{\mathbb{R}^3} v\cdot D_xf(\log f + 1)\,dv\,dx$    by (15), (16)
        $= -\int_{\mathbb{R}^3} v\cdot\Big(\int_{\mathbb{R}^3} D_x(f\log f)\,dx\Big)dv$
        $= 0$.

□

Remark. A smooth function $f : \mathbb{R}^3 \to [0,\infty)$, $f = f(v)$, is called a Maxwellian if

(17)    $Q(f,f) \equiv 0$ on $\mathbb{R}^3$.

It is known that each Maxwellian has the form:

(18)    $f(v) = ae^{-b|v-c|^2}$    $(v \in \mathbb{R}^3)$

for constants $a, b \in \mathbb{R}$, $c \in \mathbb{R}^3$: see Truesdell-Muncaster [T-M].

According to the proof of Theorem 1, we have

$\frac{dH}{dt} < 0$

unless $v \mapsto f(x,v,t)$ is a Maxwellian for all $x \in \mathbb{R}^3$. This observation suggests that as $t \to \infty$
solutions of Boltzmann's equations will approach Maxwellians, that is

(19)    $f(x,v,t) \approx a(x,t)e^{-b(x,t)|v-c(x,t)|^2}$    $(x, v \in \mathbb{R}^3)$

for $t \gg 1$. □
c. H and entropy

We provide in this section some physical arguments suggesting a connection between the
H-function and entropy. The starting point is to replace (19) by the equality

(20)    $f(x,v,t) = a(x,t)e^{-b(x,t)|v-c(x,t)|^2}$    $(x, v \in \mathbb{R}^3,\ t \ge 0)$,

where $a, b : \mathbb{R}^3 \times [0,\infty) \to (0,\infty)$, $c : \mathbb{R}^3 \times [0,\infty) \to \mathbb{R}^3$. In other words we are assuming that
at each point $x$ in space and instant $t$ in time, the distribution $v \mapsto f(x,v,t)$ is a Maxwellian
and is thus determined by the macroscopic parameters $a = a(x,t)$, $b = b(x,t)$, $c = c(x,t)$.
We intend to interpret these physically.

(i) It is first of all convenient to rewrite (20):

(21)    $f(x,v,t) = \frac{n}{(2\pi\lambda)^{3/2}}e^{-\frac{|v-\mathbf{v}|^2}{2\lambda}}$,

where $n$, $\lambda$, $\mathbf{v}$ are functions of $(x,t)$, $n, \lambda > 0$. Then

(22)    (a) $\int_{\mathbb{R}^3} f\,dv = n$
        (b) $\int_{\mathbb{R}^3} vf\,dv = n\mathbf{v}$
        (c) $\int_{\mathbb{R}^3} |v - \mathbf{v}|^2f\,dv = 3n\lambda$.

Thus (22)(a) says:

(23)    $\int_{\mathbb{R}^3} f(x,v,t)\,dv = n(x,t)$,

where $n(x,t)$ is the particle density at $x \in \mathbb{R}^3$, $t \ge 0$. Introduce also

$m$ = mass/particle.

Then

(24)    $\int_{\mathbb{R}^3} mf(x,v,t)\,dv = mn(x,t) =: \rho(x,t)$,

for $\rho(x,t)$ the mass density. Then (22)(b) implies

(25)    $\int_{\mathbb{R}^3} mvf(x,v,t)\,dv = \rho(x,t)\mathbf{v}(x,t)$,

and thus $\mathbf{v}(x,t)$ is the macroscopic velocity. Using (22)(c) we deduce:

(26)    $\int_{\mathbb{R}^3}\frac{m}{2}|v|^2f(x,v,t)\,dv = \frac12\rho(x,t)|\mathbf{v}(x,t)|^2 + \frac32\rho(x,t)\lambda(x,t)$.

The term on the left is the total energy at $(x,t)$ (since $\frac{m}{2}|v|^2$ is the kinetic energy of a particle
with mass $m$ and velocity $v$, and $f(x,v,t)$ is the number of such particles at $(x,t)$). The
expression $\frac12\rho|\mathbf{v}|^2$ on the right is the macroscopic kinetic energy. Thus the term $\frac32\rho\lambda$ must
somehow model macroscopic internal energy.
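The moment identities (22)(a)-(c) can be confirmed numerically. The sketch below is a minimal check under assumptions of this illustration: it uses the fact that the Maxwellian (21) factors into three one-dimensional Gaussians with variance $\lambda$, computes one-dimensional moments by the midpoint rule, and assembles the three-dimensional moments from them. The parameter values and function names are hypothetical.

```python
import math

def gaussian_moments(center, lam, n=4000, width=12.0):
    """Midpoint-rule moments of the 1D density (2*pi*lam)**-0.5 * exp(-(s-center)**2/(2*lam)).

    Returns (integral of g, integral of s*g, integral of (s-center)**2 * g).
    """
    sigma = math.sqrt(lam)
    a = center - width * sigma
    h = 2.0 * width * sigma / n
    m0 = m1 = m2 = 0.0
    for i in range(n):
        s = a + (i + 0.5) * h
        g = math.exp(-(s - center) ** 2 / (2.0 * lam)) / math.sqrt(2.0 * math.pi * lam)
        m0 += g * h
        m1 += s * g * h
        m2 += (s - center) ** 2 * g * h
    return m0, m1, m2

def maxwellian_moments(n_density, lam, v_bar):
    """Moments (22)(a)-(c) of the Maxwellian (21) with macroscopic velocity v_bar."""
    per_axis = [gaussian_moments(c, lam) for c in v_bar]
    m0 = [p[0] for p in per_axis]
    mass = n_density * m0[0] * m0[1] * m0[2]
    momentum = tuple(n_density * per_axis[i][1] * m0[(i + 1) % 3] * m0[(i + 2) % 3]
                     for i in range(3))
    second = n_density * sum(per_axis[i][2] * m0[(i + 1) % 3] * m0[(i + 2) % 3]
                             for i in range(3))
    return mass, momentum, second

mass, momentum, second = maxwellian_moments(2.0, 0.7, (1.0, -0.5, 0.25))
```

For the sample parameters $n = 2$, $\lambda = 0.7$, $\mathbf{v} = (1, -0.5, 0.25)$ the three outputs should agree with $n$, $n\mathbf{v}$ and $3n\lambda$ respectively, up to quadrature error.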

(ii) To further the interpretation we now suppose that our gas can be understood
macroscopically as a simple ideal gas, as earlier discussed in §I.F. From that section we recall the
equilibrium entropy function

(27)    $S = R\log V + C_V\log T + S_0$,

valid for mole number $N = 1$. Here

$S$ = entropy/mole
$C_V$ = heat capacity/mole.

Now a mole of any substance contains Avogadro's number $N_A$ of molecules: see Appendix A.
Thus the mass of a mole of our gas is $N_Am$. Define now

$s$ = entropy/unit mass
$c_v$ = heat capacity/unit mass.

Then

(28)    $s = S/N_Am$,    $c_v = C_V/N_Am$.

Recall also from §I.F that

(29)    $\gamma = \frac{C_P}{C_V} > 1$,    $C_P - C_V = R$.

Then (27)-(29) imply:

(30)    $s = \frac{R\log V + C_V\log T}{N_Am} + s_0 = \frac{R}{N_Am}\big(\log V + (\gamma-1)^{-1}\log T\big) + s_0 = \frac{k}{m}\big(\log V + (\gamma-1)^{-1}\log T\big) + s_0$,

where

(31)    $k = \frac{R}{N_A}$

is Boltzmann's constant. We now further hypothesize (as in Chapter III) that formula (30)
makes sense at each point $x \in \mathbb{R}^3$, $t \ge 0$ for our nonequilibrium gas. That is, in (30) we
replace

(32)    $s$ by $s(x,t)$ = entropy density/unit mass
        $T$ by $\theta(x,t)$ = local temperature
        $V$ by $\frac{1}{\rho(x,t)}$ = volume/unit mass.

Inserting (32) into (30) gives:

(33)    $s = \frac{k}{m}\big((\gamma-1)^{-1}\log\theta - \log\rho\big) + s_0$.
(iii) Return now to (21). We compute for $x \in \mathbb{R}^3$, $t \ge 0$:

$h(x,t) := \int_{\mathbb{R}^3} f(x,v,t)\log f(x,v,t)\,dv$
        $= \frac{n}{(2\pi\lambda)^{3/2}}\int_{\mathbb{R}^3} e^{-\frac{|v-\mathbf{v}|^2}{2\lambda}}\Big[\log n - \frac32\log(2\pi\lambda) - \frac{|v-\mathbf{v}|^2}{2\lambda}\Big]dv$
        $= n\Big(\log n - \frac32\log\lambda + r_0\Big)$,

$r_0$ denoting a constant. Since $nm = \rho$, we can rewrite:

(34)    $h(x,t) = \frac{\rho}{m}\Big(\log\rho - \frac32\log\lambda + r_0\Big)$,

$r_0$ now a different constant.

Comparing now (33), (34), we deduce that up to arbitrary constants

(35)    $\rho(x,t)s(x,t) = -kh(x,t)$    $(x \in \mathbb{R}^3,\ t \ge 0)$,

provided $\lambda$ is proportional to $\theta$,

(36)    $\lambda(x,t) = \kappa\theta(x,t)$    $(\kappa > 0)$,

and $(\gamma-1)^{-1} = 3/2$, that is

(37)    $\gamma = \frac53$.

Making these further assumptions, we compute using (35) that

$S(t) := \int_{\mathbb{R}^3} s(\cdot,t)\,dm = \int_{\mathbb{R}^3} s(\cdot,t)\rho(\cdot,t)\,dx = -k\int_{\mathbb{R}^3} h(\cdot,t)\,dx = -kH(t)$.

Hence

(38)    $S(t) = -kH(t)$    $(t \ge 0)$.

So the total entropy at time $t$ is just $-kH$ at time $t$ and consequently the estimate (13) that
$dH/dt \le 0$ is yet another version of Clausius' inequality.

(iv) It remains to compute $\kappa$. Owing to (26), we expect

$\frac32\rho\lambda = \frac32\rho\kappa\theta$

to represent the total internal energy at $(x,t)$. That is, $\frac32\kappa\theta$ should be the internal energy/unit
mass.

Now we saw in §I.F that the internal energy/mole at equilibrium is $C_VT$. Thus we can
expect

(39)    $\frac32\kappa\theta = c_v\theta$,

where we recall $c_v$ is the heat capacity/unit mass. Thus (28), (39) imply:

$\kappa = \frac23\frac{C_V}{N_Am}$.

But (29), (37) imply $C_V = \frac32R$ and so

(40)    $\kappa = \frac{R}{N_Am} = \frac{k}{m}$.

We summarize by returning to (21):

(41)    $f(x,v,t) = n\Big(\frac{m}{2\pi k\theta}\Big)^{3/2}e^{-\frac{m|v-\mathbf{v}|^2}{2k\theta}}$,

where

$n(x,t)$ = particle density at $(x,t)$

$\theta(x,t)$ = local temperature at $(x,t)$

$\mathbf{v}(x,t)$ = macroscopic fluid velocity at $(x,t)$.

This formula is the Maxwell-Boltzmann distribution for $v$.

Remark. For reference later in Chapter VII, we rewrite (41) as

(42)    $f = n\frac{e^{-\beta H}}{Z}$,

for

(43)    $\beta = \frac{1}{k\theta}$,    $H = \frac{m}{2}|v - \mathbf{v}|^2$,    $Z = \Big(\frac{2\pi k\theta}{m}\Big)^{3/2}$.

□

B. Single conservation law 

Euler's and Boltzmann's equations are among the most important physical PDE but,
unfortunately, there is no fully satisfactory mathematical theory available concerning the
existence, uniqueness and regularity of solutions for either. Much effort has focused therefore
on certain simpler model PDE, the rigorous analysis of which presumably provides clues
about the structure of solutions of Euler's and Boltzmann's PDE.

In this section we discuss a type of nonlinear first-order PDE called a conservation law
and describe how entropy-like notions are employed to record irreversibility phenomena and
in fact to define appropriate weak solutions. Following Lions, Perthame, Tadmor [L-P-T1]
we introduce also a kinetic approximation, which is a sort of simpler analogue of Boltzmann's
equation. In §C following we discuss systems of conservation laws.

1. Integral solutions 

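A PDE of the form

(1) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$

is called a scalar conservation law. Here the unknown is

$u : \mathbb{R}^n \times [0, \infty) \to \mathbb{R}$

and we are given the flux function

$F : \mathbb{R} \to \mathbb{R}^n, \quad F = (F^1, \dots, F^n).$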



Physical interpretation. We regard $u = u(x,t)$ $(x \in \mathbb{R}^n,\ t \ge 0)$ as the density of some
scalar conserved quantity. If V represents a fixed subregion of $\mathbb{R}^n$, then

(2) $\frac{d}{dt} \int_V u(\cdot,t)\,dx$

represents the rate of change of the total amount of the quantity within V, and we assume
this change equals the flux inward through $\partial V$:

(3) $-\int_{\partial V} F \cdot \nu\,dS.$

We hypothesize that F is a function of u. By the divergence theorem, (3) equals
$-\int_V \operatorname{div} F(u)\,dx$, so equating (2), (3) for any region V yields the
conservation law (1). Note that (1) is a caricature of Euler's equations (6) in §A. □

Notation. We will sometimes rewrite (1) into nondivergence form

(4) $u_t + b(u) \cdot Du = 0$ in $\mathbb{R}^n \times (0, \infty)$,
where

(5) $b = F', \quad b = (b^1, \dots, b^n).$

We can interpret (4) as a nonlinear transport equation, for which the velocity $v = b(u)$
depends on the unknown density u. The left hand side of (4) is thus a sort of nonlinear
variant of the material derivative $\frac{D}{Dt}$ from Chapter III. The characteristics of (1) are solutions
$x(\cdot)$ of the ODE

(6) $\dot{x}(t) = v(x(t),t) = b(u(x(t),t))$ $(t \ge 0)$.

□
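
Along a characteristic (6), a $C^1$ solution is constant, since $\frac{d}{dt}u(x(t),t) = u_t + b(u) \cdot Du = 0$ by (4). Characteristics are therefore straight lines $x(t) = x_0 + b(g(x_0))\,t$ for as long as the solution stays smooth. The sketch below, for n = 1, computes these lines and the time at which two of them meet. The helper names, the choice b(u) = u (Burgers' flux $F(u) = u^2/2$) and the ramp-shaped initial profile are assumptions made for illustration.

```python
def characteristic_position(x0, t, g, b):
    """Position at time t of the straight characteristic starting at x0."""
    return x0 + b(g(x0)) * t


def first_crossing_time(x_left, x_right, g, b):
    """Time at which the characteristics from x_left < x_right meet, or None."""
    speed_left, speed_right = b(g(x_left)), b(g(x_right))
    if speed_left <= speed_right:
        return None  # the right characteristic is at least as fast: no collision
    return (x_right - x_left) / (speed_left - speed_right)


def burgers_speed(u):
    """b = F' for the illustrative flux F(u) = u^2 / 2."""
    return u


def ramp_profile(x):
    """Illustrative initial data: 1 for x < 0, linear ramp on [0, 1], 0 for x > 1."""
    if x < 0.0:
        return 1.0
    return 1.0 - x if x < 1.0 else 0.0


if __name__ == "__main__":
    print(first_crossing_time(0.2, 0.8, ramp_profile, burgers_speed))
    print(characteristic_position(0.2, 1.0, ramp_profile, burgers_speed))
```

For this ramp every characteristic issuing from [0, 1] reaches x = 1 at time t = 1, which is where a smooth solution must cease to exist.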

We will study the initial value problem

(7) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$,
    $u = g$ on $\mathbb{R}^n \times \{t = 0\}$,

where $g \in L^1_{loc}(\mathbb{R}^n)$ is the initial density. The first difficulty in this subject is to understand
the right way to interpret a function u as solving (7).

Definition. We say $u \in L^1_{loc}(\mathbb{R}^n \times (0, \infty))$ is an integral solution of (7) provided:

(8) $\int_0^\infty \int_{\mathbb{R}^n} u v_t + F(u) \cdot Dv\,dx\,dt + \int_{\mathbb{R}^n} g\,v(\cdot, 0)\,dx = 0$

for all $v \in C_c^1(\mathbb{R}^n \times [0,\infty))$.

Examples.

(a) If

$g(x) = \begin{cases} 1 & x < 0 \\ 0 & x > 0, \end{cases}$

then

$u(x,t) = \begin{cases} 1 & x < t/2 \\ 0 & x > t/2 \end{cases}$

is an integral solution of

(9) $u_t + \left(\frac{u^2}{2}\right)_x = 0$ in $\mathbb{R} \times (0, \infty)$,
    $u = g$ on $\mathbb{R} \times \{t = 0\}$.

(b) If, instead,

$g(x) = \begin{cases} 0 & x < 0 \\ 1 & x > 0, \end{cases}$

then

$u_1(x,t) = \begin{cases} 0 & x < t/2 \\ 1 & x > t/2 \end{cases}$

and

$u_2(x,t) = \begin{cases} 0 & x < 0 \\ x/t & 0 < x < t \\ 1 & x > t \end{cases}$

are both integral solutions of (9).

As explained in Smoller [S], [E1, Chapter III], etc., "the physically correct" integral
solution of (b) is $u_2$. The function u from Example (a) admits a "physical shock", with
which the characteristics collide. The function $u_1$ from Example (b) is wrong since it has a
"nonphysical shock", from which characteristics emanate.
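(c) If

$g(x) = \begin{cases} 1 & x < 0 \\ 1 - x & 0 < x < 1 \\ 0 & x > 1, \end{cases}$

then

$u(x,t) = \begin{cases} 1 & x < t,\ 0 \le t < 1 \\ \frac{1-x}{1-t} & t < x < 1,\ 0 \le t < 1 \\ 0 & x > 1,\ 0 \le t < 1 \\ 1 & x < \frac{1+t}{2},\ t \ge 1 \\ 0 & x > \frac{1+t}{2},\ t \ge 1 \end{cases}$

is an integral solution of (9).

The function u is the physically correct solution of (9). Note carefully that although
$g = u(\cdot, 0)$ is continuous, $u(\cdot, t)$ is discontinuous for times $t \ge 1$. Note also that this example
illustrates irreversibility. If

$\hat g(x) = \begin{cases} 1 & x < \frac12 \\ 0 & x > \frac12, \end{cases}$

then the corresponding physically correct solution is

$\hat u(x,t) = \begin{cases} 1 & x < \frac{1+t}{2} \\ 0 & x > \frac{1+t}{2}. \end{cases}$

But then $u \equiv \hat u$ for times $t \ge 1$. □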
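
The distinction between the shock of Example (a) and the two candidate solutions of Example (b) can be observed numerically. The sketch below uses Godunov's first-order finite volume scheme for Burgers' equation (9); this scheme is not discussed in the notes and is chosen here only because monotone schemes of this type are known to approximate the physically correct solution. The grid size, domain, CFL number and function names are all illustrative assumptions.

```python
def godunov_flux(ul, ur):
    """Godunov numerical flux for F(u) = u^2 / 2 between states ul (left), ur (right)."""
    if ul <= ur:
        # rarefaction side: minimum of F over [ul, ur]
        if ul > 0.0:
            return 0.5 * ul * ul
        if ur < 0.0:
            return 0.5 * ur * ur
        return 0.0
    # shock side: maximum of F over [ur, ul]
    return max(0.5 * ul * ul, 0.5 * ur * ur)


def solve_burgers(g, x_min=-1.0, x_max=2.0, cells=300, t_end=1.0, cfl=0.5):
    """March cell averages from t = 0 to t_end; returns (cell centers, values)."""
    dx = (x_max - x_min) / cells
    xs = [x_min + (i + 0.5) * dx for i in range(cells)]
    u = [g(x) for x in xs]
    t = 0.0
    while t < t_end - 1e-12:
        speed = max(max(abs(v) for v in u), 1e-12)
        dt = min(cfl * dx / speed, t_end - t)
        padded = [u[0]] + u + [u[-1]]  # constant extension at both ends
        flux = [godunov_flux(padded[i], padded[i + 1]) for i in range(cells + 1)]
        u = [u[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(cells)]
        t += dt
    return xs, u


def sample(xs, u, x):
    """Value in the cell whose center is closest to x."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return u[nearest]


if __name__ == "__main__":
    xs, u = solve_burgers(lambda x: 0.0 if x < 0.0 else 1.0)  # data of Example (b)
    print(sample(xs, u, 0.5))  # compare with u_2(0.5, 1) = 0.5 and u_1(0.5, 1)
    xs, u = solve_burgers(lambda x: 1.0 if x < 0.0 else 0.0)  # data of Example (a)
    print(sample(xs, u, 0.25), sample(xs, u, 0.75))
```

For the data of Example (b) the computed profile at t = 1 follows the fan $x/t$ of $u_2$ rather than the jump of $u_1$; for the data of Example (a) it reproduces the shock travelling with speed 1/2.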

2. Entropy solutions

We next introduce some additional mathematical structure which will clarify what we
mean above by a "physically correct" solution.

Definition. We call $(\Phi, \Psi)$ an entropy/entropy flux pair for the conservation law (1) provided

(i) $\Phi : \mathbb{R} \to \mathbb{R}$ is convex

and

(ii) $\Psi : \mathbb{R} \to \mathbb{R}^n$, $\Psi = (\Psi^1, \dots, \Psi^n)$
satisfies

(10) $\Psi' = b\,\Phi'$.

Thus

(11) $\Psi^i(z) = \int_0^z b^i(v)\Phi'(v)\,dv$ $(i = 1, \dots, n)$

up to additive constants. □

Motivation. Notice that if u is a $C^1$-solution of (1) in some region of $\mathbb{R}^n \times (0, \infty)$, then (10)
implies

$\Phi(u)_t + \operatorname{div} \Psi(u) = 0$

there. We interpret this equality to mean that there is no entropy production within such a
region. On the other hand our examples illustrate that integral solutions can be discontinuous,
and in particular Examples (a), (b) suggest certain sorts of discontinuities are physically
acceptable, others not.

We intend to employ entropy/entropy flux pairs to provide an inequality criterion, a kind
of analogue of the Clausius-Duhem inequality, for selecting the proper integral solution. The
easiest way to motivate all this is to introduce the regularized PDE

(12) $u^\varepsilon_t + \operatorname{div} F(u^\varepsilon) = \varepsilon \Delta u^\varepsilon$ in $\mathbb{R}^n \times (0, \infty)$,

where $\varepsilon > 0$. By analogy with the incompressible Navier-Stokes equations ((18) in §III.C)
we can regard the term "$\varepsilon \Delta u^\varepsilon$" as modelling viscous effects, which presumably tend to smear
out discontinuities. And indeed it is possible to prove under various technical hypotheses
that (12) has a smooth solution $u^\varepsilon$, subject to initial condition $u^\varepsilon = g$ on $\mathbb{R}^n \times \{t = 0\}$.
Take a smooth entropy/entropy flux pair $(\Phi, \Psi)$ and compute:

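(13)
$\Phi(u^\varepsilon)_t + \operatorname{div} \Psi(u^\varepsilon) = \Phi'(u^\varepsilon)\,u^\varepsilon_t + \Psi'(u^\varepsilon) \cdot Du^\varepsilon$
$\quad = \Phi'(u^\varepsilon)\,(-b(u^\varepsilon) \cdot Du^\varepsilon + \varepsilon \Delta u^\varepsilon) + \Psi'(u^\varepsilon) \cdot Du^\varepsilon$
$\quad = \varepsilon\,\Phi'(u^\varepsilon)\,\Delta u^\varepsilon$  by (10)
$\quad = \varepsilon \operatorname{div}(\Phi'(u^\varepsilon) Du^\varepsilon) - \varepsilon\,\Phi''(u^\varepsilon)\,|Du^\varepsilon|^2$
$\quad \le \varepsilon \operatorname{div}(\Phi'(u^\varepsilon) Du^\varepsilon)$,

the inequality holding since $\Phi$ is convex. Now take $v \in C_c^1(\mathbb{R}^n \times (0, \infty))$, $v \ge 0$. Then (13)
implies

$\int_0^\infty \int_{\mathbb{R}^n} \Phi(u^\varepsilon) v_t + \Psi(u^\varepsilon) \cdot Dv\,dx\,dt = -\int_0^\infty \int_{\mathbb{R}^n} v\,(\Phi(u^\varepsilon)_t + \operatorname{div} \Psi(u^\varepsilon))\,dx\,dt$
$\quad \ge -\varepsilon \int_0^\infty \int_{\mathbb{R}^n} v \operatorname{div}(\Phi'(u^\varepsilon) Du^\varepsilon)\,dx\,dt$
$\quad = \varepsilon \int_0^\infty \int_{\mathbb{R}^n} \Phi'(u^\varepsilon)\,Dv \cdot Du^\varepsilon\,dx\,dt.$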

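Assume now that as $\varepsilon \to 0$,

(14) $u^\varepsilon \to u$ boundedly, a.e.,
and further we have the estimate

(15) $\sup_{0 < \varepsilon \le 1} \int_0^\infty \int_{\mathbb{R}^n} \varepsilon\,|Du^\varepsilon|^2\,dx\,dt < \infty$.
Then send $\varepsilon \to 0$ above:

(16) $\int_0^\infty \int_{\mathbb{R}^n} \Phi(u) v_t + \Psi(u) \cdot Dv\,dx\,dt \ge 0$.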
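
The passage to the limit leading to (16) can be spelled out as follows. This is a sketch of the standard argument under the stated hypotheses (14), (15); the constant C below is an assumed bound for $|\Phi'|$ on the range of the uniformly bounded functions $u^\varepsilon$.

```latex
% Right-hand side: Cauchy-Schwarz together with the energy estimate (15).
\left| \varepsilon \int_0^\infty\!\!\int_{\mathbb{R}^n}
      \Phi'(u^\varepsilon)\, Dv \cdot Du^\varepsilon \, dx\, dt \right|
\;\le\; \varepsilon^{1/2}\, C\, \| Dv \|_{L^2}
      \left( \int_0^\infty\!\!\int_{\mathbb{R}^n}
      \varepsilon\, |Du^\varepsilon|^2 \, dx\, dt \right)^{1/2}
\;\longrightarrow\; 0
\qquad (\varepsilon \to 0).

% Left-hand side: dominated convergence on the (compact) support of v, using (14).
\int_0^\infty\!\!\int_{\mathbb{R}^n}
   \Phi(u^\varepsilon)\, v_t + \Psi(u^\varepsilon) \cdot Dv \, dx\, dt
\;\longrightarrow\;
\int_0^\infty\!\!\int_{\mathbb{R}^n}
   \Phi(u)\, v_t + \Psi(u) \cdot Dv \, dx\, dt .
```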

This inequality motivates the following 

Definition. We say that $u \in C([0,\infty); L^1(\mathbb{R}^n))$ is an entropy solution of

(17) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$,
     $u = g$ on $\mathbb{R}^n \times \{t = 0\}$

provided

(18) $\Phi(u)_t + \operatorname{div} \Psi(u) \le 0$

in the weak sense for each entropy/entropy flux pair $(\Phi, \Psi)$, and

(19) $u(\cdot, 0) = g$.

Remarks. (i) The meaning of (18) is that the integral inequality (16) holds for all
$v \in C_c^1(\mathbb{R}^n \times (0,\infty))$, $v \ge 0$.

(ii) We can regard (18) as a form of the Clausius-Duhem inequality, except that the sign
is reversed. Note carefully: if $\Phi$, $\Psi$ is an entropy/entropy flux pair, then $s = -\Phi(u)$ acts like
a physical entropy density. The customary mathematical and physical usages of the term
entropy differ in sign.

(iii) Since we have seen that $\Phi(u)_t + \operatorname{div} \Psi(u) = 0$ in any region where u is $C^1$, the possible
inequalities in (18) must arise only where u is not $C^1$, e.g. along shocks. We motivated (18)
by the vanishing viscosity method of sending $\varepsilon \to 0$ in (12). This is somewhat related to
the model thermodynamic system introduced in §II.A.4 where we added dissipation to an
ideal dissipationless model. In that setting, the ideal model arises if conversely we send the
dissipation terms $R_1, R_2 \to 0$. By contrast, for conservation laws if we obtain u as the limit
(14) of solutions $u^\varepsilon$ of the viscous approximations (12), then u "remembers" this vanishing
viscosity process in the structure of its discontinuities.

For instance in examples (a)-(b) above, the physically correct shocks do arise as the limit
$\varepsilon \to 0$, whereas the physically incorrect shock cannot. (To be consistent, we should redefine
the g's and thus the u's for large $|x|$, so that $u \in L^1$.) □

The existence of an entropy solution of (17) follows via the vanishing viscosity process,
and the really important assertion is this theorem of Kruzkov:

Theorem Assume that $u$, $\hat u$ are entropy solutions of

(20) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$,
     $u = g$ on $\mathbb{R}^n \times \{t = 0\}$
and
(21) $\hat u_t + \operatorname{div} F(\hat u) = 0$ in $\mathbb{R}^n \times (0, \infty)$,
     $\hat u = \hat g$ on $\mathbb{R}^n \times \{t = 0\}$.
Then

(22) $\|u(\cdot,t) - \hat u(\cdot,t)\|_{L^1(\mathbb{R}^n)} \le \|u(\cdot,s) - \hat u(\cdot,s)\|_{L^1(\mathbb{R}^n)}$

for each $0 \le s \le t$.

In particular an entropy solution of the initial value problem (20) is unique.
See for instance [E1, §11.4.3] for proof.

3. Condition E 



We illustrate the meaning of the entropy condition for n = 1 by examining more closely
the case that u is a piecewise smooth solution of

(23) $u_t + F(u)_x = 0$ in $\mathbb{R} \times (0,\infty)$.

More precisely assume that some region $V \subset \mathbb{R} \times (0, \infty)$ is subdivided into a left region $V_l$
and a right region $V_r$ by a smooth curve C, parametrized by $\{x = s(t)\}$.

Assume that u is smooth in both $V_l$ and $V_r$, and further u satisfies the condition

(24) $\Phi(u)_t + \Psi(u)_x \le 0$ in V

in the weak sense, for each entropy/entropy flux pair $(\Phi, \Psi)$. Thus

(25) $\iint_V \Phi(u) v_t + \Psi(u) v_x\,dx\,dt \ge 0$
for each $v \in C_c^1(V)$, with $v \ge 0$. Here

(26) $\Phi : \mathbb{R} \to \mathbb{R}$ is convex, and $\Psi' = b\,\Phi'$ for $b = F'$.



First take $v \in C_c^1(V_l)$, $v \ge 0$. Then, since u is smooth in $V_l$, we can integrate by parts in
(25) to deduce

$\iint_{V_l} [\Phi(u)_t + \Psi(u)_x]\,v\,dx\,dt \le 0.$

This inequality is valid for each v as above, whence

$\Phi(u)_t + \Psi(u)_x \le 0$ in $V_l$.

Take $\Phi(z) = \pm z$, $\Psi(z) = \pm F(z)$, to conclude

$u_t + F(u)_x = 0$ in $V_l$.

But then (26) implies

(27) $\Phi(u)_t + \Psi(u)_x = 0$ in $V_l$.

Similarly

$u_t + F(u)_x = 0$ in $V_r$,

(28) $\Phi(u)_t + \Psi(u)_x = 0$ in $V_r$.
Next take $v \in C_c^1(V)$, $v \ge 0$. Then (25) says

$\iint_{V_l} \Phi(u) v_t + \Psi(u) v_x\,dx\,dt + \iint_{V_r} \Phi(u) v_t + \Psi(u) v_x\,dx\,dt \ge 0.$

Integrate by parts in each term, recalling (27), (28), to deduce

(29) $\int_C v\,[(\Phi(u_l) - \Phi(u_r))\,\nu^2 + (\Psi(u_l) - \Psi(u_r))\,\nu^1]\,dl \ge 0$

where $\nu = (\nu^1, \nu^2)$ is the outer unit normal to $V_l$ along C, $u_l$ is the limit of u from the left
along C, and $u_r$ is the limit from the right. Since

$\nu = \frac{(1, -\dot s)}{(1 + \dot s^2)^{1/2}}$

and $v \ge 0$ is arbitrary, we conclude from (29) that

(30) $\dot s\,(\Phi(u_r) - \Phi(u_l)) \ge \Psi(u_r) - \Psi(u_l)$ along C.

Taking $\Phi(z) = \pm z$, $\Psi(z) = \pm F(z)$, we obtain the Rankine-Hugoniot jump condition

(31) $\dot s\,(u_r - u_l) = F(u_r) - F(u_l)$.

Select now a time t, and suppose $u_l < u_r$. Fix $u_l < u < u_r$ and define the entropy/entropy
flux pair

$\Phi(z) = (z - u)^+$
$\Psi(z) = \int_{u_l}^z \operatorname{sgn}^+(v - u)\,F'(v)\,dv.$

Then

$\Phi(u_r) - \Phi(u_l) = u_r - u$
$\Psi(u_r) - \Psi(u_l) = F(u_r) - F(u).$

Consequently (30) implies

(32) $\dot s\,(u - u_r) \le F(u) - F(u_r)$.
Combine (31), (32):

(33) $F(u) \ge \frac{F(u_r) - F(u_l)}{u_r - u_l}\,(u - u_r) + F(u_r)$ $(u_l < u < u_r)$.

This inequality holds for each $u_l < u < u_r$ and says that the graph of F on the interval
$[u_l, u_r]$ lies above the line segment connecting $(u_l, F(u_l))$ to $(u_r, F(u_r))$.
A similar proof shows that if $u_l > u_r$, then

(34) $F(u) \le \frac{F(u_r) - F(u_l)}{u_r - u_l}\,(u - u_r) + F(u_r)$ $(u_r < u < u_l)$.

The inequalities (33), (34) are called Oleinik's condition E.

Remarks. (i) In particular,

if F is strictly convex, then $u_l > u_r$

and

if F is strictly concave, then $u_l < u_r$.

(ii) If (33) or (34) holds, then

$F'(u_r) \le \dot s = \frac{F(u_r) - F(u_l)}{u_r - u_l} \le F'(u_l).$

As characteristics move with speed $b = F'$, we see that characteristics can only collide
with the shock and cannot emanate from it. The same conclusion follows from (34). This
geometric observation records the irreversibility inherent in the entropy condition (24).

□
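
Condition E is easy to test for a given jump. The sketch below samples the open interval between $u_l$ and $u_r$ and compares the flux with the chord appearing in (33), (34); the function name, the sampling resolution and the two example fluxes are illustrative assumptions.

```python
def satisfies_condition_e(flux, u_left, u_right, samples=1000, tol=1e-12):
    """Check Oleinik's condition E, i.e. (33) or (34), at sampled points."""
    if u_left == u_right:
        return True  # no jump, nothing to check
    # slope of the chord = shock speed from the Rankine-Hugoniot condition (31)
    slope = (flux(u_right) - flux(u_left)) / (u_right - u_left)
    lo, hi = min(u_left, u_right), max(u_left, u_right)
    for i in range(1, samples):
        u = lo + (hi - lo) * i / samples
        chord = slope * (u - u_right) + flux(u_right)
        if u_left < u_right and flux(u) < chord - tol:
            return False  # (33) fails: graph dips below the chord
        if u_left > u_right and flux(u) > chord + tol:
            return False  # (34) fails: graph rises above the chord
    return True


def burgers_flux(u):
    """Strictly convex example flux F(u) = u^2 / 2."""
    return 0.5 * u * u


def concave_flux(u):
    """Strictly concave example flux F(u) = u (1 - u)."""
    return u * (1.0 - u)


if __name__ == "__main__":
    print(satisfies_condition_e(burgers_flux, 1.0, 0.0))  # jump of Example (a)
    print(satisfies_condition_e(burgers_flux, 0.0, 1.0))  # jump of u_1 in Example (b)
```

For the convex flux only the jump with $u_l > u_r$ passes, in agreement with Remark (i); for the concave flux the roles are reversed.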

4. Kinetic formulation 

Our intention next is to introduce and study a sort of kinetic formulation of the conservation
law $u_t + \operatorname{div} F(u) = 0$. If we think of this PDE as a simplified version of Euler's
equations, the corresponding kinetic PDE is then a rough analogue of Boltzmann's equation.
The following is taken from Perthame-Tadmor [P-T] and Lions-Perthame-Tadmor [L-P-T1].

We will study the kinetic equation

(35) $w_t + b(y) \cdot D_x w = m_y$ in $\mathbb{R}^n \times \mathbb{R} \times (0, \infty)$,

where

$w : \mathbb{R}^n \times \mathbb{R} \times (0, \infty) \to \mathbb{R}, \quad w = w(x, y, t),$

is the unknown, $b = F'$ as in §2, and m is a nonnegative Radon measure on $\mathbb{R}^n \times \mathbb{R} \times (0, \infty)$.
Hence

$m_y = \frac{\partial}{\partial y} m.$

We interpret w as solving (35) in the weak (i.e. distribution) sense.

We can think of y as a variable parameterizing the velocity field $v = b(y)$ and so,
in analogy with Boltzmann's equation, interpret $w(x,y,t)$ as the density of particles with
velocity $v = b(y)$ at the point $x \in \mathbb{R}^n$ and time $t \ge 0$. Then

(36) $u(x,t) := \int_{\mathbb{R}} w(x,y,t)\,dy$

should represent the density at (x,t). The idea will be to show under certain circumstances
that u solves the conservation law $u_t + \operatorname{div} F(u) = 0$.

To facilitate this interpretation, we introduce the pseudo-Maxwellian

(37) $\chi_a(y) = \begin{cases} 1 & \text{if } 0 < y \le a \\ -1 & \text{if } a \le y \le 0 \\ 0 & \text{otherwise} \end{cases}$

for each parameter $a \in \mathbb{R}$.

As a sort of crude analogue with the theory set forth in §A, we might guess that w, u are
further related by the formula:

(38) $w(x,y,t) = \chi_{u(x,t)}(y).$

This equality says that "on the mesoscopic scale the velocity distribution is governed by
the pseudo-Maxwellian, with macroscopic parameter $a = u(x,t)$ at each point $x \in \mathbb{R}^n$, time
$t \ge 0$". It is remarkable that this rough interpretation can be made quite precise.

Theorem

(i) Let u be a bounded entropy solution of

(39) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$

and define

(40) $w(x,y,t) = \chi_{u(x,t)}(y)$ $(x \in \mathbb{R}^n,\ y \in \mathbb{R},\ t \ge 0)$.

Then

$w \in C([0,\infty); L^1(\mathbb{R}^n \times \mathbb{R})) \cap L^\infty(\mathbb{R}^n_x \times (0, \infty), L^1(\mathbb{R}_y))$

solves the kinetic equation

(41) $w_t + b(y) \cdot D_x w = m_y$ in $\mathbb{R}^n \times \mathbb{R} \times (0, \infty)$

for some nonnegative Radon measure m, supported in

$\mathbb{R}^n \times [-R_0, R_0] \times (0,\infty)$,

where $R_0 = \|u\|_{L^\infty}$.

(ii) Conversely, let $w \in C([0, \infty); L^1(\mathbb{R}^n \times \mathbb{R})) \cap L^\infty(\mathbb{R}^n_x \times (0, \infty), L^1(\mathbb{R}_y))$ solve (41) for
some measure m as above. Assume also w has the form

$w = \chi_{u(x,t)}$.

Then

(42) $u(x,t) = \int_{\mathbb{R}} w(x,y,t)\,dy$

is an entropy solution of (39).
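
Two elementary identities for the pseudo-Maxwellian underlie the proof that follows: $\int_{\mathbb{R}} \chi_a(y)\,dy = a$, and $\int_{\mathbb{R}} \chi_a(y)\Phi'(y)\,dy = \Phi(a) - \Phi(0)$ for smooth $\Phi$, for either sign of a. The sketch below checks both by midpoint quadrature; the function names, the integration window and the test entropy $\Phi(y) = y^2$ are illustrative assumptions.

```python
def chi(a, y):
    """Pseudo-Maxwellian: +1 between 0 and a if a > 0, -1 between a and 0 if a < 0."""
    if 0.0 < y <= a:
        return 1.0
    if a <= y <= 0.0:
        return -1.0
    return 0.0


def integrate_against_chi(a, weight, bound=5.0, cells=40000):
    """Midpoint approximation of the integral of chi_a(y) * weight(y) over [-bound, bound]."""
    dy = 2.0 * bound / cells
    total = 0.0
    for i in range(cells):
        y = -bound + (i + 0.5) * dy
        total += chi(a, y) * weight(y) * dy
    return total


if __name__ == "__main__":
    for a in (0.7, -1.3):
        mass = integrate_against_chi(a, lambda y: 1.0)          # should be close to a
        entropy = integrate_against_chi(a, lambda y: 2.0 * y)   # should be close to a**2
        print(a, mass, entropy)
```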

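Proof. 1. First we prove (ii). Let $\Phi : \mathbb{R} \to \mathbb{R}$ be convex, with $\Phi(0) = 0$. Temporarily
assume as well that $\Phi$ is $C^2$. Let $\psi \in C_c^\infty(\mathbb{R})$ satisfy

(43) $0 \le \psi \le 1$, $\psi \equiv 1$ on $[-R_0, R_0]$,
     $\psi \equiv 0$ on $\mathbb{R} - [-R_0 - 1, R_0 + 1]$.

Take $v \in C_c^1(\mathbb{R}^n \times (0, \infty))$, $v \ge 0$. We employ

(44) $v(x,t)\,\Phi'(y)\,\psi(y)$

as a test function in the definition of w as a weak solution of the transport equation (41):

(45) $\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} w\,(v\Phi'\psi)_t + w\,b(y) \cdot D_x(v\Phi'\psi)\,dx\,dy\,dt = \int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} (v\Phi'\psi)_y\,dm.$

We must examine each term in this identity.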
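2. Now

$\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} w\,(v\Phi'\psi)_t\,dx\,dy\,dt = \int_0^\infty \int_{\mathbb{R}^n} v_t \left( \int_{\mathbb{R}} w\,\Phi'\psi\,dy \right) dx\,dt.$

By hypothesis $w = \chi_{u(x,t)}$, and therefore

(46) $\int_{\mathbb{R}} w(x,y,t)\,\Phi'(y)\psi(y)\,dy = \int_{\mathbb{R}} \chi_{u(x,t)}(y)\,\Phi'(y)\psi(y)\,dy$
     $\quad = \int_0^{u(x,t)} \Phi'(y)\psi(y)\,dy$  if $u(x,t) \ge 0$
     $\quad = \Phi(u(x,t))$,

since $\Phi(0) = 0$ and $\psi \equiv 1$ on $[0, u(x,t)]$. A similar computation is valid if $u(x,t) \le 0$. Hence

(47) $\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} w\,(v\Phi'\psi)_t\,dx\,dy\,dt = \int_0^\infty \int_{\mathbb{R}^n} v_t\,\Phi(u)\,dx\,dt.$

Similarly,

$\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} w\,b(y) \cdot D_x(v\Phi'\psi)\,dx\,dy\,dt = \int_0^\infty \int_{\mathbb{R}^n} Dv \cdot \left( \int_{\mathbb{R}} b(y)\,w\,\Phi'\psi\,dy \right) dx\,dt.$

Now if $u(x,t) \ge 0$, then

$\int_{\mathbb{R}} w\,b(y)\,\Phi'\psi\,dy = \int_0^{u(x,t)} b(y)\,\Phi'(y)\psi(y)\,dy = \Psi(u(x,t)),$

for

(48) $\Psi(z) := \int_0^z b(y)\,\Phi'(y)\,dy.$

The same calculation is valid if $u(x,t) \le 0$. Thus

(49) $\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} w\,b(y) \cdot D_x(v\Phi'\psi)\,dx\,dy\,dt = \int_0^\infty \int_{\mathbb{R}^n} Dv \cdot \Psi(u)\,dx\,dt.$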

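3. We investigate now the term on the right hand side of (45):

(50) $\int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} (v\Phi'\psi)_y\,dm = \int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} v\,(\Phi''\psi + \Phi'\psi')\,dm \ge \int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} v\,\Phi'\psi'\,dm,$
since $\Phi'' \ge 0$, $\psi, v \ge 0$ and m is nonnegative. Additionally, since $\psi' \equiv 0$ on the support of m, we have

(51) $\int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} v\,\Phi'\psi'\,dm = 0.$

4. Combine (45), (47), (49), (50), (51), to conclude

(52) $\int_0^\infty \int_{\mathbb{R}^n} \Phi(u) v_t + \Psi(u) \cdot Dv\,dx\,dt \ge 0$

for all v as above. An easy approximation removes the requirement that $\Phi$ be $C^2$. Thus for
each entropy/entropy flux pair we have

$\Phi(u)_t + \operatorname{div} \Psi(u) \le 0$ in $\mathbb{R}^n \times (0, \infty)$

in the weak sense, and consequently u is an entropy solution of (39).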

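5. Now we prove assertion (i) of the Theorem. Let u be a bounded entropy solution of
$u_t + \operatorname{div} F(u) = 0$ and define

(53) $w(x,y,t) := \chi_{u(x,t)}(y)$ $(x \in \mathbb{R}^n,\ y \in \mathbb{R},\ t \ge 0)$.

Define also the distribution T on $\mathbb{R}^n \times \mathbb{R} \times (0, \infty)$ by

(54) $\langle T, \phi \rangle := -\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} w\,(\phi_t + b(y) \cdot D_x\phi)\,dx\,dy\,dt$

for all $\phi \in C_c^\infty(\mathbb{R}^n \times \mathbb{R} \times (0, \infty))$. That is,

(55) $T = w_t + b(y) \cdot D_x w$ in the distribution sense.

Observe $T \equiv 0$ off $\mathbb{R}^n \times [-R_0, R_0] \times (0, \infty)$, where $R_0 = \|u\|_{L^\infty}$.
Define now another distribution M by

(56) $\langle M, \phi \rangle := -\left\langle T, \int_{-\infty}^y \phi(x,z,t)\,dz \right\rangle$
for $\phi$ as above. Then

(57) $T = \frac{\partial M}{\partial y}$ in the distribution sense.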

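6. We now claim that

(58) $\langle M, \phi \rangle \ge 0$

for all $\phi \in C_c^\infty(\mathbb{R}^n \times \mathbb{R} \times (0, \infty))$ with $\phi \ge 0$.
To verify this, first suppose

(59) $\phi(x,y,t) = \alpha(x,t)\,\beta(y)$,
with

$\alpha \ge 0$, $\alpha \in C_c^\infty(\mathbb{R}^n \times (0,\infty))$,
$\beta \ge 0$, $\beta \in C_c^\infty(\mathbb{R})$.

Take

(60) $\Phi(y) := \int_0^y \int_{-\infty}^z \beta(w)\,dw\,dz.$

Then

(61) $\langle M, \phi \rangle = \langle M, \alpha\beta \rangle$
     $\quad = -\langle T, \alpha\Phi' \rangle$  by (56), (60)
     $\quad = \int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} w\,[\alpha_t + b(y) \cdot D_x\alpha]\,\Phi'\,dx\,dy\,dt$  by (54)
     $\quad = \int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} \chi_u\,\Phi'\,[\alpha_t + b(y) \cdot D_x\alpha]\,dy\,dx\,dt$
     $\quad = \int_0^\infty \int_{\mathbb{R}^n} \Phi(u)\,\alpha_t + \Psi(u) \cdot D_x\alpha\,dx\,dt,$

where

$\Psi' = \Phi' b.$

The last equality results from calculations like those in steps 1, 2 above. Now $\Phi$ is convex
(as $\Phi'' = \beta \ge 0$) and so, since u is an entropy solution, the last term in (61) is nonnegative. Thus

(62) $\langle M, \phi \rangle \ge 0$ if $\phi(x,y,t) = \alpha(x,t)\,\beta(y)$, where $\alpha \ge 0$, $\beta \ge 0$.

Next, take

$\eta(x,y,t) = \lambda(x,t)\,\nu(y)$
where $\lambda$, $\nu$ are smooth, nonnegative and have compact support. Assume also

$\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} \eta\,dx\,dy\,dt = 1.$

Let $\eta_\varepsilon(\cdot) = \frac{1}{\varepsilon^{n+2}}\,\eta(\cdot/\varepsilon)$, and then, given $\phi \in C_c^\infty(\mathbb{R}^n \times \mathbb{R} \times (0, \infty))$, $\phi \ge 0$, set

$\phi_\varepsilon(x,y,t) = (\eta_\varepsilon * \phi)(x,y,t) = \int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} \lambda_\varepsilon(x - \tilde x, t - \tilde t)\,\nu_\varepsilon(y - \tilde y)\,\phi(\tilde x, \tilde y, \tilde t)\,d\tilde x\,d\tilde y\,d\tilde t.$

We have

$\langle M, \phi_\varepsilon \rangle = \int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} \langle M, \lambda_\varepsilon(\cdot - \tilde x, \cdot - \tilde t)\,\nu_\varepsilon(\cdot - \tilde y) \rangle\,\phi(\tilde x, \tilde y, \tilde t)\,d\tilde x\,d\tilde y\,d\tilde t \ge 0,$

owing to (62). Send $\varepsilon \to 0$ to establish the claim (58).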

7. Finally we recall that (58) implies M is represented by a nonnegative Radon measure.
That is, there exists m as stated in the Theorem such that

$\langle M, \phi \rangle = \int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} \phi\,dm.$

(See e.g. [E-G].) Thus

$w_t + b(y) \cdot D_x w = T = \frac{\partial M}{\partial y} = m_y$

in the distribution sense. □

Remark. For each entropy/entropy flux pair

$\Phi(u)_t + \operatorname{div} \Psi(u) \le 0$

in the distribution sense, and so, as above, we can represent

$\Phi(u)_t + \operatorname{div} \Psi(u) = -\gamma^\Phi$

where $\gamma^\Phi$ is a nonnegative Radon measure on $\mathbb{R}^n \times (0, \infty)$, depending on $\Phi$. This measure
is supported outside of any open regions where u is $C^1$, and so records the "change of the
entropy along the shocks and other places where u is not smooth".

The measure m on the right hand side of the kinetic equation (41) for $w = \chi_u$ somehow
records simultaneously the information encoded in $\gamma^\Phi$ for all the entropies $\Phi$. □

5. A hydrodynamical limit 

To illustrate the usefulness of the kinetic formulation of the conservation law introduced
in §4, we consider in this section the problem of understanding the limit as $\varepsilon \to 0$ of solutions
of the scaled transport equation

(63) $w^\varepsilon_t + b(y) \cdot D_x w^\varepsilon = \frac{1}{\varepsilon}(\chi_{u^\varepsilon} - w^\varepsilon)$ in $\mathbb{R}^n \times \mathbb{R} \times (0, \infty)$

where

(64) $u^\varepsilon(x,t) := \int_{\mathbb{R}} w^\varepsilon(x,y,t)\,dy.$
This is a nonlocal PDE for the unknown $w^\varepsilon$.

Physical interpretation. We may think of (63) as a scaled, simplified version of Boltzmann's
equation, the parameter $\varepsilon$ being a crude approximation to the mean free path length
between particle collisions. The right hand side of (63) is a sort of analogue of the collision
operator $Q(\cdot, \cdot)$. If we similarly rescale Boltzmann's equation

$f^\varepsilon_t + v \cdot D_x f^\varepsilon = \frac{1}{\varepsilon}\,Q(f^\varepsilon, f^\varepsilon)$

and send $\varepsilon \to 0$, we may expect the particle density $f^\varepsilon(\cdot, v, \cdot)$ to approach a Maxwellian
distribution, controlled by the macroscopic parameters $\rho(x,t)$, $\mathbf{v}(x,t)$, $\theta(x,t)$, which in turn
should satisfy macroscopic PDE. See for instance Bardos-Golse-Levermore [B-G-L] for more
on this. This is called a hydrodynamical limit. □

Our scaled problem (63), (64) is a vastly simplified variant, for which it is possible to
understand rigorously the limit $\varepsilon \to 0$. First let us adjoin to (63), (64) the initial condition

(65) $w^\varepsilon = \chi_g$ on $\mathbb{R}^n \times \mathbb{R} \times \{t = 0\}$,

where $g : \mathbb{R}^n \to \mathbb{R}$ is a given, smooth function, with compact support.

Theorem As $\varepsilon \to 0$,

(66) $w^\varepsilon \rightharpoonup w$ weakly * in $L^\infty(\mathbb{R}^n \times \mathbb{R} \times (0, \infty))$,

where w solves

(67) $w_t + b(y) \cdot D_x w = m_y$, $w = \chi_u$ in $\mathbb{R}^n \times \mathbb{R} \times (0, \infty)$,
     $w = \chi_g$ on $\mathbb{R}^n \times \mathbb{R} \times \{t = 0\}$,

for m a nonnegative Radon measure and u the unique entropy solution of

(68) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$,
     $u = g$ on $\mathbb{R}^n \times \{t = 0\}$.

We say the conservation law (68) is the hydrodynamical limit of the scaled kinetic
equations (63) as $\varepsilon \to 0$.

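Proof (Outline). 1. It is shown in [P-T] that

(69) $\{u^\varepsilon\}_{0 < \varepsilon \le 1}$ is strongly precompact in $L^1_{loc}(\mathbb{R}^n \times [0,\infty))$,

and further

(70) $|w^\varepsilon| \le 1$ a.e.,
     $w^\varepsilon \ge 0$ on $\{y \ge 0\}$, $w^\varepsilon \le 0$ on $\{y \le 0\}$,
     $\operatorname{supt}(w^\varepsilon) \subset \mathbb{R}^n \times [-R_0, R_0] \times (0,\infty)$,

where $R_0 = \|g\|_{L^\infty}$. We will use these facts below.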

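2. We now claim that we can write

(71) $\frac{1}{\varepsilon}(\chi_{u^\varepsilon} - w^\varepsilon) = m^\varepsilon_y$,

for some nonnegative function $m^\varepsilon$ supported in $\mathbb{R}^n \times [-R_0, R_0] \times (0, \infty)$. To confirm this,
fix $-R_0 \le a \le R_0$ and assume $h \in L^\infty(\mathbb{R})$ satisfies

(72) $\operatorname{supt}(h) \subset [-R_0, R_0]$,
     $-1 \le h \le 0$ if $y \le 0$,
     $0 \le h \le 1$ if $y \ge 0$,
     $\int_{\mathbb{R}} h\,dy = a$.

Then

(73) $\chi_a(y) - h(y) = q'(y)$ (a.e. $y \in \mathbb{R}$)

for

$q(y) := \int_{-\infty}^y \chi_a(z) - h(z)\,dz.$

Recall that

$\chi_a(z) = \begin{cases} 1 & \text{if } 0 < z \le a \\ -1 & \text{if } a \le z \le 0 \\ 0 & \text{otherwise.} \end{cases}$

Thus if $a \ge 0$, we deduce from (72), (73) that

$q' \ge 0$ for a.e. $-\infty < y < a$
$q' \le 0$ for a.e. $a < y < \infty$

and the same inequalities are true if $a \le 0$. Furthermore $q(-R_0) = 0$ and

$q(R_0) = \int_{-R_0}^{R_0} \chi_a(z) - h(z)\,dz = a - \int_{-\infty}^{\infty} h\,dy = 0.$

Since q is nondecreasing to the left of a, nonincreasing to the right of a, and vanishes outside
$(-R_0, R_0)$, it follows that

(74) $q \ge 0$ on $\mathbb{R}$.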
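
The construction in step 2 can be illustrated numerically. The sketch below tabulates $q(y) = \int_{-\infty}^y \chi_a - h\,dz$ on a grid for two admissible choices of h and confirms that q starts and ends at zero and stays nonnegative, as (74) asserts. The specific profiles h, the value $R_0 = 1$ and the function names are assumptions made for the example.

```python
def chi(a, z):
    """Pseudo-Maxwellian chi_a evaluated at z."""
    if 0.0 < z <= a:
        return 1.0
    if a <= z <= 0.0:
        return -1.0
    return 0.0


def q_values(a, h, radius=1.0, cells=4000):
    """Running midpoint sums of chi_a - h over [-radius, radius]; q is zero to the left."""
    dz = 2.0 * radius / cells
    q, values = 0.0, [0.0]
    for i in range(cells):
        z = -radius + (i + 0.5) * dz
        q += (chi(a, z) - h(z)) * dz
        values.append(q)
    return values


def h_positive(z):
    """Satisfies (72) with a = 0.5: equals 0.5 on (0, 1], zero elsewhere."""
    return 0.5 if 0.0 < z <= 1.0 else 0.0


def h_negative(z):
    """Satisfies (72) with a = -0.4: equals -0.4 on [-1, 0), zero elsewhere."""
    return -0.4 if -1.0 <= z < 0.0 else 0.0


if __name__ == "__main__":
    for a, h in ((0.5, h_positive), (-0.4, h_negative)):
        values = q_values(a, h)
        print(a, min(values), max(values), values[-1])
```

In both cases the maximum of q is attained at y = a, where q' changes sign.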

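3. Recall (72) and apply the results in step 2 to

$h(y) = w^\varepsilon(x,y,t)$,

$a = u^\varepsilon(x,t) = \int_{\mathbb{R}} w^\varepsilon(x,y,t)\,dy$.

According to (70), this choice of h satisfies conditions (72). Then (73), (74) say

$\frac{1}{\varepsilon}(\chi_{u^\varepsilon} - w^\varepsilon) = m^\varepsilon_y$

where

$\operatorname{supt}(m^\varepsilon) \subset [-R_0, R_0]$, $m^\varepsilon \ge 0$

for each (x, t). This is assertion (71).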

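4. Next we assert:

(75) $\sup_{0 < \varepsilon \le 1} \|m^\varepsilon\|_{L^1(\mathbb{R}^n \times \mathbb{R} \times (0,\infty))} < \infty$.

A formal calculation leading to (75) is this:

$\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} m^\varepsilon\,dx\,dy\,dt = \int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} m^\varepsilon\,(y)_y\,dy\,dx\,dt$
$\quad = -\int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} m^\varepsilon_y\,y\,dy\,dx\,dt$
$\quad = -\int_0^\infty \int_{\mathbb{R}^n} \int_{\mathbb{R}} (w^\varepsilon_t + b(y) \cdot D_x w^\varepsilon)\,y\,dy\,dx\,dt$
$\quad = \int_{\mathbb{R}^n} \int_{\mathbb{R}} w^\varepsilon(x,y,0)\,y\,dy\,dx$
$\quad = \int_{\mathbb{R}^n} \int_{\mathbb{R}} \chi_g(y)\,y\,dy\,dx$  by (65)
$\quad = \int_{\mathbb{R}^n} \frac{g^2}{2}\,dx < \infty.$

We omit the detailed proof of (75).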

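5. Employing now (69), (70), (75) we extract a sequence $\varepsilon_r \to 0$ so that

$w^{\varepsilon_r} \rightharpoonup w$ weakly * in $L^\infty$
$u^{\varepsilon_r} \to u$ strongly in $L^1_{loc}$
$m^{\varepsilon_r} \rightharpoonup m$ weakly * as measures.

Hence

(76) $w_t + b(y) \cdot D_x w = m_y$ in $\mathbb{R}^n \times \mathbb{R} \times (0, \infty)$

in the weak sense. Furthermore

$\chi_{u^\varepsilon} - w^\varepsilon = \varepsilon\,m^\varepsilon_y;$

and so for each $\phi \in C_c^\infty(\mathbb{R}^n \times \mathbb{R} \times (0, \infty))$,

$\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} \phi\,(\chi_{u^\varepsilon} - w^\varepsilon)\,dx\,dy\,dt = -\varepsilon \int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}^n} \phi_y\,dm^\varepsilon \to 0.$
Consequently

(77) $\chi_{u^{\varepsilon_r}} \rightharpoonup w$ weakly * in $L^\infty$.

Now

$\chi_{u^{\varepsilon_r}}(y) = \begin{cases} 1 & \text{if } 0 < y \le u^{\varepsilon_r} \\ -1 & \text{if } u^{\varepsilon_r} \le y \le 0 \\ 0 & \text{otherwise,} \end{cases}$

and so

$\int_{\mathbb{R}} |\chi_{u^{\varepsilon_r}} - \chi_u|\,dy = |u^{\varepsilon_r} - u|.$

Since $u^{\varepsilon_r} \to u$ strongly in $L^1_{loc}$, we see

$\chi_{u^{\varepsilon_r}} \to \chi_u$ strongly in $L^1_{loc}$.

Thus

$w = \chi_u$.

Hence (67) holds and so, according to the kinetic formulation in §4, u solves the conservation
law (68). □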

C. Systems of conservation laws 

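A system of conservation laws is written

(1) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$,

where the unknown is

$u : \mathbb{R}^n \times [0, \infty) \to \mathbb{R}^m$, $u = (u^1, \dots, u^m)$

and

$F : \mathbb{R}^m \to \mathbb{M}^{m \times n}$, $F = \begin{pmatrix} F^1_1 & \dots & F^1_n \\ \vdots & & \vdots \\ F^m_1 & \dots & F^m_n \end{pmatrix}$

is given.

Notation. (i) We can rewrite (1) into the nondivergence form

(2) $u_t + B(u)^T : Du = 0$ in $\mathbb{R}^n \times (0, \infty)$

for

$B = DF$, $B : \mathbb{R}^m \to L(\mathbb{R}^m, \mathbb{M}^{m \times n})$.

We sometimes write $F = F(z)$, $B = B(z)$ for $z \in \mathbb{R}^m$.

(ii) In terms of the components of u, (1) says

(3) $u^k_t + \sum_{i=1}^n (F^k_i(u))_{x_i} = 0$ $(k = 1, \dots, m)$

and (2) means

(4) $u^k_t + \sum_{i=1}^n \sum_{l=1}^m \frac{\partial F^k_i}{\partial z_l}(u)\,u^l_{x_i} = 0$ $(k = 1, \dots, m)$.

□

We are interested in properly formulating the initial value problem

(5) $u_t + \operatorname{div} F(u) = 0$ in $\mathbb{R}^n \times (0, \infty)$,
    $u = g$ on $\mathbb{R}^n \times \{t = 0\}$,

where

$g : \mathbb{R}^n \to \mathbb{R}^m$, $g = (g^1, \dots, g^m)$

is given.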

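1. Entropy conditions

Definition. We say $u \in L^1_{loc}(\mathbb{R}^n \times (0, \infty); \mathbb{R}^m)$ is an integral solution of (5) provided

(6) $\int_0^\infty \int_{\mathbb{R}^n} u \cdot v_t + F(u) : Dv\,dx\,dt + \int_{\mathbb{R}^n} g \cdot v(\cdot, 0)\,dx = 0$

for each $v \in C_c^1(\mathbb{R}^n \times [0, \infty); \mathbb{R}^m)$.

Notation. We write $v = (v^1, \dots, v^m)$,

$Dv = \begin{pmatrix} v^1_{x_1} & \dots & v^1_{x_n} \\ \vdots & & \vdots \\ v^m_{x_1} & \dots & v^m_{x_n} \end{pmatrix}.$

□

As for scalar conservation laws this is an inadequate notion of solution, and so we
introduce this additional

Definition. We call $(\Phi, \Psi)$ an entropy/entropy flux pair for the conservation law (1) provided

(i) $\Phi : \mathbb{R}^m \to \mathbb{R}$ is convex

and

(ii) $\Psi : \mathbb{R}^m \to \mathbb{R}^n$, $\Psi = (\Psi^1, \dots, \Psi^n)$
satisfies

(7) $D\Psi = B\,D\Phi$.

Notation. The identity (7) means:

(8) $\Psi^i_{z_k} = \sum_{l=1}^m \frac{\partial F^l_i}{\partial z_k}\,\Phi_{z_l}$ $(1 \le i \le n,\ 1 \le k \le m)$.

Motivation. Suppose u is a $C^1$ solution of (1) in some region of $\mathbb{R}^n \times (0, \infty)$. Then

(9) $\Phi(u)_t + \operatorname{div} \Psi(u) = 0$

there. Indeed, we compute:

$\Phi(u)_t + \operatorname{div} \Psi(u) = \sum_{k=1}^m \Phi_{z_k}(u)\,u^k_t + \sum_{k=1}^m \sum_{i=1}^n \Psi^i_{z_k}(u)\,u^k_{x_i}$
$\quad = -\sum_{k=1}^m \Phi_{z_k}(u) \sum_{i=1}^n \sum_{l=1}^m \frac{\partial F^k_i}{\partial z_l}(u)\,u^l_{x_i} + \sum_{k=1}^m \sum_{i=1}^n \Psi^i_{z_k}(u)\,u^k_{x_i}$  according to (4)
$\quad = \sum_{k=1}^m \sum_{i=1}^n u^k_{x_i} \left( -\sum_{l=1}^m \frac{\partial F^l_i}{\partial z_k}(u)\,\Phi_{z_l}(u) + \Psi^i_{z_k}(u) \right)$  (interchanging the indices k and l in the first sum)
$\quad = 0$, owing to (8).

□

Unlike the situation for scalar conservation laws (i.e. m = 1), there need not exist any
entropy/entropy flux pairs for a given system of conservation laws. For physically derived
PDE, on the other hand, we can hope to discern at least some such pairs.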

2. Compressible Euler equations in one space dimension 

We return to §A.1 and consider now the compressible, isentropic Euler equations in one
space dimension. According to (6) in §A.1, the relevant PDE are

(10) $\rho_t + (\rho v)_x = 0$,
     $(\rho v)_t + (\rho v^2 + p)_x = 0$ in $\mathbb{R}^1 \times (0,\infty)$,

where $\rho$ is the density, v the velocity and

(11) $p = p(\rho)$

is the pressure. Observe that (10) is of the form

$u_t + (F(u))_x = 0$

for m = 2,

(12) $u = (\rho, \rho v)$,
     $F = (z_2,\ z_2^2/z_1 + p(z_1))$.

Remark. We have

$B = DF = \begin{pmatrix} 0 & 1 \\ -z_2^2/z_1^2 + p'(z_1) & 2z_2/z_1 \end{pmatrix}.$

The eigenvalues of B are

$\lambda_\pm = \frac{z_2}{z_1} \pm (p'(z_1))^{1/2},$

assuming

(13) $p' > 0$.
Reverting to physical variables, we see

$\lambda_\pm = v \pm (p'(\rho))^{1/2}.$
It follows that the speed of sound for isentropic flow is

$p'(\rho)^{1/2}$. □
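
The eigenvalue formula can be checked directly from the 2 x 2 matrix B. The sketch below builds $B = DF$ for the flux (12), computes its eigenvalues from the trace and determinant, and compares them with $v \pm (p'(\rho))^{1/2}$. It assumes the power-law pressure $p(\rho) = \kappa\rho^\gamma$ with $\kappa = (\gamma-1)^2/(4\gamma)$ that the notes adopt in (20); the function names and sample state are hypothetical.

```python
import math


def pressure_derivative(rho, gamma):
    """p'(rho) for the assumed law p(rho) = kappa * rho**gamma."""
    kappa = (gamma - 1.0) ** 2 / (4.0 * gamma)
    return kappa * gamma * rho ** (gamma - 1.0)


def flux_jacobian(z1, z2, gamma):
    """B = DF for F(z) = (z2, z2**2 / z1 + p(z1)), with z = (rho, rho * v)."""
    return [[0.0, 1.0],
            [-(z2 / z1) ** 2 + pressure_derivative(z1, gamma), 2.0 * z2 / z1]]


def eigenvalues_2x2(matrix):
    """Real eigenvalues (smaller, larger) of a 2 x 2 matrix via trace and determinant."""
    (a, b), (c, d) = matrix
    trace, det = a + d, a * d - b * c
    disc = math.sqrt(trace * trace / 4.0 - det)
    return trace / 2.0 - disc, trace / 2.0 + disc


if __name__ == "__main__":
    rho, v, gamma = 1.5, -0.7, 1.4
    lam_minus, lam_plus = eigenvalues_2x2(flux_jacobian(rho, rho * v, gamma))
    sound = math.sqrt(pressure_derivative(rho, gamma))
    print(lam_minus, v - sound)
    print(lam_plus, v + sound)
```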

a. Computing entropy/entropy flux pairs

We attempt now to discover entropy/entropy flux pairs $(\Phi, \Psi)$, where to simplify
subsequent calculations we look for $\Phi$, $\Psi$ as functions of $(\rho, v)$ (and not $(u^1, u^2) = (\rho, \rho v)$). Thus
we seek

$\Phi = \Phi(\rho, v)$, $\Psi = \Psi(\rho, v)$

such that

(14) the mapping $(\rho, \rho v) \mapsto \Phi$ is convex

and

(15) $\Phi_t + \Psi_x = 0$ in any region where $(\rho, v)$ are $C^1$-solutions of (10).

So let us assume $(\rho, v)$ solve (10), which we recast into nondivergence form:

(16) $\rho_t + \rho_x v + \rho v_x = 0$,
     $v_t + v v_x = -\frac{1}{\rho}\,p_x = -p'\,\frac{\rho_x}{\rho}$.

Observe that the second line here is just the second line of formula (1) from §A.1. So if
$\Phi = \Phi(\rho, v)$, $\Psi = \Psi(\rho, v)$, we can compute:

$\Phi_t + \Psi_x = \Phi_\rho \rho_t + \Phi_v v_t + \Psi_\rho \rho_x + \Psi_v v_x$
$\quad = \Phi_\rho(-\rho_x v - \rho v_x) + \Phi_v\left(-v v_x - p'\,\frac{\rho_x}{\rho}\right) + \Psi_\rho \rho_x + \Psi_v v_x$
$\quad = \rho_x\left[\Psi_\rho - v\Phi_\rho - \frac{p'}{\rho}\Phi_v\right] + v_x\left[\Psi_v - \rho\Phi_\rho - v\Phi_v\right].$

Consequently, $\Phi_t + \Psi_x \equiv 0$ for all smooth solutions $(\rho, v)$ of (16) if and only if

(17) $\Psi_\rho = v\Phi_\rho + \frac{p'}{\rho}\Phi_v$,
     $\Psi_v = \rho\Phi_\rho + v\Phi_v$.

We proceed further by noting $\Psi_{\rho v} = \Psi_{v\rho}$:

$\left(v\Phi_\rho + \frac{p'}{\rho}\Phi_v\right)_v = (\rho\Phi_\rho + v\Phi_v)_\rho.$

Hence

$\Phi_\rho + v\Phi_{\rho v} + \frac{p'}{\rho}\Phi_{vv} = \Phi_\rho + \rho\Phi_{\rho\rho} + v\Phi_{v\rho},$

and consequently

(18) $\Phi_{\rho\rho} = \frac{p'(\rho)}{\rho^2}\,\Phi_{vv}$ $(\rho > 0,\ v \in \mathbb{R})$.

In summary, if $\Phi$ solves (18) and we compute $\Psi$ from (17), then $(\Phi, \Psi)$ satisfies $\Phi_t + \Psi_x \equiv 0$,
whenever $(\rho, v)$ are smooth solutions of Euler's equations (10). Since $p' > 0$, (18) is a
linear nonhomogeneous wave equation.

Definition. $\Phi$ is called a weak entropy function if $\Phi$ solves (18), with the initial conditions

(19) $\Phi = 0$, $\Phi_\rho = g$ on $\mathbb{R} \times \{\rho = 0\}$,

for some given $g : \mathbb{R} \to \mathbb{R}$, $g = g(v)$.

To go further, let us take from §A.1 the explicit equation of state

(20) $p(\rho) = \kappa\rho^\gamma$, where $\kappa = \frac{(\gamma-1)^2}{4\gamma}$, $\gamma > 1$,

the constant $\kappa$ so selected to simplify the algebra.

Lemma (i) The solution of (18), (19) for

$g = \delta_{\{0\}}$ = Dirac mass at the origin

is

(21) $\chi(\rho, v) = (\rho^{\gamma-1} - v^2)_+^\lambda$, $\lambda = \frac{3-\gamma}{2(\gamma-1)}$.

(ii) The general solution of (18), (19) is

(22) $\Phi(\rho, v) = \int_{\mathbb{R}} g(y)\,\chi(\rho, y - v)\,dy$ $(\rho > 0,\ v \in \mathbb{R})$.

(iii) Furthermore, $\Phi$ defined by (22) is convex in $(\rho, \rho v)$ if and only if g is convex.

(iv) The entropy flux $\Psi$ associated with $\Phi$ is

(23) $\Psi(\rho, v) = \int_{\mathbb{R}} g(y)\,(\theta y + (1 - \theta)v)\,\chi(\rho, y - v)\,dy$

for $\theta = \frac{\gamma-1}{2}$.

See [L-P-T2] for proof. We will momentarily see that we can regard $\chi$ as a sort of
pseudo-Maxwellian, parameterized by the macroscopic parameters $\rho$, v.

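Example. Take $g(v) = v^2$. Then, up to a positive multiplicative constant depending only on $\gamma$,

(24) $\Phi(\rho, v) = \int_{\mathbb{R}} y^2\,(\rho^{\gamma-1} - (y - v)^2)_+^\lambda\,dy = \frac{1}{2}\rho v^2 + \frac{\kappa}{\gamma-1}\rho^\gamma.$

The term $\frac{1}{2}\rho v^2$ is the density of the kinetic energy, and $\frac{\kappa}{\gamma-1}\rho^\gamma$ is the density of the internal
energy. Hence $\Phi$ is the energy density. If $(\rho, \rho v)$ is an entropy solution of (10), then

$\Phi_t + \Psi_x \le 0$,

and so

(25) $\sup_{t \ge 0} \int_{\mathbb{R}} \frac{1}{2}\rho(x,t)\,v^2(x,t) + \frac{\kappa}{\gamma-1}\rho^\gamma(x,t)\,dx < \infty$,

provided the initial conditions satisfy this bound. □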
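
The relation between the kinetic integral in (24) and the energy density can be probed numerically. The sketch below evaluates $\int_{\mathbb{R}} y^2 \chi(\rho, y - v)\,dy$ by midpoint quadrature and divides by $\frac{1}{2}\rho v^2 + \frac{\kappa}{\gamma-1}\rho^\gamma$; the quotient should not depend on $(\rho, v)$. A short calculation with the substitution $s = y - v$ suggests that this quotient equals $2\int_{-1}^{1}(1 - s^2)^\lambda\,ds$, i.e. 4 for $\gamma = 3$ and $\pi$ for $\gamma = 2$. The function names, sample states and quadrature resolution are illustrative assumptions.

```python
import math


def chi(rho, s, gamma):
    """Kernel (21): (rho**(gamma-1) - s**2)_+ ** lam with lam = (3-gamma)/(2(gamma-1))."""
    lam = (3.0 - gamma) / (2.0 * (gamma - 1.0))
    base = rho ** (gamma - 1.0) - s * s
    return base ** lam if base > 0.0 else 0.0


def kinetic_energy_integral(rho, v, gamma, cells=20000):
    """Midpoint approximation of the integral of y**2 * chi(rho, y - v) over the support."""
    half_width = rho ** ((gamma - 1.0) / 2.0)  # chi vanishes for |y - v| >= half_width
    dy = 2.0 * half_width / cells
    total = 0.0
    for i in range(cells):
        y = v - half_width + (i + 0.5) * dy
        total += y * y * chi(rho, y - v, gamma) * dy
    return total


def energy_density(rho, v, gamma):
    """Right-hand side of (24) with kappa as in (20)."""
    kappa = (gamma - 1.0) ** 2 / (4.0 * gamma)
    return 0.5 * rho * v * v + kappa / (gamma - 1.0) * rho ** gamma


def energy_ratio(rho, v, gamma):
    """Quotient of the kinetic integral by the energy density."""
    return kinetic_energy_integral(rho, v, gamma) / energy_density(rho, v, gamma)


if __name__ == "__main__":
    for gamma in (3.0, 2.0):
        print(gamma, energy_ratio(1.0, 0.3, gamma), energy_ratio(2.5, -1.2, gamma))
```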

b. Kinetic formulation

Theorem Let $(\rho, \rho v) \in L^\infty((0, \infty); L^1(\mathbb{R}, \mathbb{R}^2))$ have finite energy and suppose $\rho \ge 0$ a.e.
Then $(\rho, \rho v)$ is an entropy solution of Euler's equations

(26) $\rho_t + (\rho v)_x = 0$,
     $(\rho v)_t + (\rho v^2 + p)_x = 0$ in $\mathbb{R} \times (0,\infty)$

if and only if there exists a nonpositive measure m on $\mathbb{R} \times \mathbb{R} \times (0, \infty)$ such that

(27) $w = \chi(\rho, y - v)$ $(\rho = \rho(x,t),\ v = v(x,t),\ y \in \mathbb{R})$
satisfies

(28) $w_t + [(\theta y + (1 - \theta)v)\,w]_x = m_{yy}$ in $\mathbb{R} \times \mathbb{R} \times (0, \infty)$.

We call (27), (28) a kinetic formulation of (26).
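Proof. 1. As in §B.4 define the distributions

(29) $T = w_t + [(\theta y + (1 - \theta)v)\,w]_x$
and

(30) $\frac{\partial^2 M}{\partial y^2} = T$.

2. Take $\Phi$, $\Psi$ to be a weak entropy/entropy flux pair as above. That is,

$\Phi(\rho, v) = \int_{\mathbb{R}} g(y)\,\chi(\rho, y - v)\,dy$
$\Psi(\rho, v) = \int_{\mathbb{R}} g(y)\,(\theta y + (1 - \theta)v)\,\chi(\rho, y - v)\,dy.$

Then

(31) $\Phi_t + \Psi_x = \int_{\mathbb{R}} g(y)\,(w_t + [(\theta y + (1 - \theta)v)\,w]_x)\,dy.$

Suppose now

$\phi(x,y,t) = \alpha(x,t)\,\beta(y)$

where

$\alpha \ge 0$, $\alpha \in C_c^\infty$,
$\beta \ge 0$, $\beta \in C_c^\infty$.

Take g so that

(32) $g'' = \beta$.

Then (31) implies

$-\int_0^\infty \int_{\mathbb{R}} \Phi\alpha_t + \Psi\alpha_x\,dx\,dt = \int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}} \alpha g\,(w_t + [(\theta y + (1 - \theta)v)\,w]_x)\,dx\,dy\,dt$
$\quad = \langle T, \alpha g \rangle$
$\quad = \langle M, \alpha\beta \rangle$  by (30), (32)
$\quad = \langle M, \phi \rangle.$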
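
3. Now if $(\rho, \rho v)$ is an entropy solution, then

(33) $\int_0^\infty \int_{\mathbb{R}} \Phi\alpha_t + \Psi\alpha_x\,dx\,dt \ge 0$

since $\alpha \ge 0$, and thus $\langle M, \phi \rangle \le 0$. This holds for all $\phi = \alpha\beta$ as above and so, as in §B.4,

(34) $\langle M, \phi \rangle \le 0$ for all $\phi \in C_c^\infty$, $\phi \ge 0$.

Thus M is represented by a nonpositive measure m. Conversely if (34) holds, then (33) is
valid for all $\alpha \ge 0$, $\alpha \in C_c^1$.

4. Lastly note the formal estimate, in which we use that m is nonpositive and take $g(y) = y^2$ in (31):

$\int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}} d|m| = -\frac{1}{2} \int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}} (y^2)_{yy}\,dm$
$\quad = -\frac{1}{2} \int_0^\infty \int_{\mathbb{R}} \int_{\mathbb{R}} y^2\,(w_t + [(\theta y + (1 - \theta)v)\,w]_x)\,dx\,dy\,dt$
$\quad = -\frac{1}{2} \int_0^\infty \int_{\mathbb{R}} \Phi_t + \Psi_x\,dx\,dt$
$\quad = \frac{1}{2} \int_{\mathbb{R}} \Phi(\rho(\cdot,0), v(\cdot,0))\,dx$
$\quad = \frac{1}{2} \int_{\mathbb{R}} \frac{1}{2}\rho(\cdot,0)\,v(\cdot,0)^2 + \frac{\kappa}{\gamma-1}\rho(\cdot,0)^\gamma\,dx < \infty.$

□

See Lions-Perthame-Tadmor [L-P-T2] and Lions-Perthame-Souganidis [L-P-S] for
remarkable applications.

CHAPTER 6: Hamilton-Jacobi and related equations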



A. Viscosity solutions

A PDE of the form

(1) $u_t + H(Du) = 0$ in $\mathbb{R}^n \times (0, \infty)$

is called a Hamilton-Jacobi equation. The unknown is

$u : \mathbb{R}^n \times [0, \infty) \to \mathbb{R}$

and the Hamiltonian

$H : \mathbb{R}^n \to \mathbb{R}$

is a given continuous function. Here $Du = (D_x u) = (u_{x_1}, \dots, u_{x_n})$.

In this short chapter we introduce the notion of viscosity solutions of (1), which are
defined in terms of various inequalities involving smooth test functions. The relevant theory
will not seem to have anything much to do with our ongoing themes concerning entropy and
PDE, but connections will be established later, in Chapter VIII.

Following Crandall-Lions [C-L] and [C-E-L] let us make the following



Definition. A bounded uniformly continuous function u is called a viscosity solution of (1)
provided for each $v \in C^\infty(\mathbb{R}^n \times (0, \infty))$

(2) if $u - v$ has a local maximum at a point $(x_0, t_0)$, then $v_t(x_0, t_0) + H(Dv(x_0, t_0)) \le 0$,

and

(3) if $u - v$ has a local minimum at a point $(x_0, t_0)$, then $v_t(x_0, t_0) + H(Dv(x_0, t_0)) \ge 0$.

Motivation. If u happens to be a $C^1$ solution of (1) in some region of $\mathbb{R}^n \times (0, \infty)$, then in
fact

$v_t(x_0, t_0) + H(Dv(x_0, t_0)) = 0$

at any point in that region where $u - v$ has a local maximum or minimum. This follows
since $u_t = v_t$, $Du = Dv$ at such a point.

The interest in (2), (3) is consequently the possibility of the inequalities holding at points
where u is not $C^1$. This is all obviously some kind of vague analogue of the theory from
Chapter V. As in that chapter let us motivate (2), (3) by the vanishing viscosity method.
So fix $\varepsilon > 0$ and consider the regularized PDE

(4) $u^\varepsilon_t + H(Du^\varepsilon) = \varepsilon\Delta u^\varepsilon$ in $\mathbb{R}^n \times (0, \infty)$.
Let us assume that as $\varepsilon \to 0$,

(5) $u^\varepsilon \to u$ locally uniformly

and further suppose for some $v \in C^\infty$ that $u - v$ has a strict local maximum at some point
$(x_0, t_0) \in \mathbb{R}^n \times (0, \infty)$. Then $u^\varepsilon - v$ has a local maximum at a nearby point $(x_\varepsilon, t_\varepsilon)$, with

(6) $(x_\varepsilon, t_\varepsilon) \to (x_0, t_0)$ as $\varepsilon \to 0$.

As v and our solution $u^\varepsilon$ of the regularized problem (4) are smooth, we have

(7) $u^\varepsilon_t = v_t$, $Du^\varepsilon = Dv$, $D^2u^\varepsilon \le D^2v$ at $(x_\varepsilon, t_\varepsilon)$,

the third expression recording the ordering of symmetric matrices. Then

$v_t(x_\varepsilon, t_\varepsilon) + H(Dv(x_\varepsilon, t_\varepsilon)) = u^\varepsilon_t(x_\varepsilon, t_\varepsilon) + H(Du^\varepsilon(x_\varepsilon, t_\varepsilon))$  by (7)
$\quad = \varepsilon\Delta u^\varepsilon(x_\varepsilon, t_\varepsilon)$  by (4)
$\quad \le \varepsilon\Delta v(x_\varepsilon, t_\varepsilon)$  by (7).

Let $\varepsilon \to 0$ and recall (6):

$v_t(x_0, t_0) + H(Dv(x_0, t_0)) \le 0.$

It is easy to modify this proof if $u - v$ has a local maximum which is not strict at $(x_0, t_0)$. A
similar proof shows that the reverse inequality holds should $u - v$ have a local minimum at
a point $(x_0, t_0)$.

Hence if the $u^\varepsilon$ (or a subsequence) converge locally uniformly to a limit u, then u is a
viscosity solution of (1). This construction by the vanishing viscosity method accounts for
the name.³ □

We will not develop here the theory of viscosity solutions, other than to state the fundamental theorem of Crandall-Lions:

Theorem. Assume that $u$, $\tilde u$ are viscosity solutions of

(8)  $u_t + H(Du) = 0$ in $\mathbb{R}^n \times (0,\infty)$,  $u = g$ on $\mathbb{R}^n \times \{t = 0\}$

and

(9)  $\tilde u_t + H(D\tilde u) = 0$ in $\mathbb{R}^n \times (0,\infty)$,  $\tilde u = \tilde g$ on $\mathbb{R}^n \times \{t = 0\}$.

Then

$\|u(\cdot,t) - \tilde u(\cdot,t)\|_{L^\infty(\mathbb{R}^n)} \le \|u(\cdot,s) - \tilde u(\cdot,s)\|_{L^\infty(\mathbb{R}^n)}$

for each $0 \le s \le t$.

In particular a viscosity solution of the initial value problem (8) is unique. See [C-E-L] for proof (cf. also [E1, §10.2]).

³ In fact, Crandall and Lions originally considered the name "entropy solutions".

B. Hopf-Lax formula

For use later we record here a representation formula for the viscosity solution of

(1)  $u_t + H(Du) = 0$ in $\mathbb{R}^n \times (0,\infty)$,  $u = g$ on $\mathbb{R}^n \times \{t = 0\}$,

in the special case that

(2)  $H : \mathbb{R}^n \to \mathbb{R}$ is convex

and

(3)  $g : \mathbb{R}^n \to \mathbb{R}$ is bounded, Lipschitz.

We have then the Hopf-Lax formula:

$u(x,t) = \inf_{y \in \mathbb{R}^n} \left\{ t L\left(\frac{x-y}{t}\right) + g(y) \right\}$  $(x \in \mathbb{R}^n,\ t > 0)$

for the unique viscosity solution of (1). Here $L$ is the Legendre transform of $H$:

$L(q) = \sup_{p \in \mathbb{R}^n} \{ p \cdot q - H(p) \}$.

See [E1, §10.3.4] for a proof. We will invoke this formula in §VIII.C.
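As an illustration only, the Hopf-Lax formula can be evaluated numerically in one space dimension. The sketch below assumes the particular Hamiltonian $H(p) = p^2/2$, whose Legendre transform is $L(q) = q^2/2$, and replaces the infimum over $y$ by a minimum over a finite grid; the initial data `g`, the grid `ys`, and the function name `hopf_lax` are hypothetical choices made for this example and do not come from the text.

```python
def hopf_lax(x, t, g, ys):
    """Approximate u(x,t) = inf_y { t*L((x-y)/t) + g(y) } for H(p) = p^2/2.

    For this H the Legendre transform is L(q) = q^2/2, so that
    t*L((x-y)/t) = (x-y)^2 / (2t). The infimum is approximated by a
    minimum over the finite grid ys (an assumption of this sketch).
    """
    return min((x - y) ** 2 / (2.0 * t) + g(y) for y in ys)


def g(y):
    # Bounded, Lipschitz initial data (illustrative choice).
    return min(abs(y), 1.0)


# Candidate minimizers y in [-4, 4] with spacing 0.01.
ys = [i / 100.0 for i in range(-400, 401)]

u_origin = hopf_lax(0.0, 1.0, g, ys)   # value at x = 0, t = 1
u_half = hopf_lax(0.5, 1.0, g, ys)     # value at x = 0.5, t = 1
```

Taking $y = x$ in the formula shows $u(x,t) \le g(x)$, which gives a quick sanity check on any such computation.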

C. A diffusion limit 

In Chapter VIII we will employ viscosity solution methods to study several asymptotic problems, involving, as we will see, entropy considerations. As these developments must wait, it is appropriate to include here a rather different application. We introduce for each $\varepsilon > 0$ a coupled linear first-order transport PDE, with terms of orders $O(1/\varepsilon)$, $O(1/\varepsilon^2)$, and study under appropriate hypotheses the limit of $w^\varepsilon$ as $\varepsilon \to 0$. This will be a diffusion limit, a sort of twin of the hydrodynamical limit from §V.B.5.

1. Formulation 

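Our PDE system is

(1)  $w^\varepsilon_t + \frac{1}{\varepsilon} B\,Dw^\varepsilon = \frac{1}{\varepsilon^2} C w^\varepsilon$ in $\mathbb{R}^n \times (0,\infty)$.

The unknown is

$w^\varepsilon : \mathbb{R}^n \times (0,\infty) \to \mathbb{R}^m$,  $w^\varepsilon = (w^{1,\varepsilon}, \dots, w^{m,\varepsilon})$.

Notation. In (1) we are given the matrix

$C = ((c_{kl}))_{m \times m}$

and also

$B = \mathrm{diag}(b^1, \dots, b^m)$,

where the vectors $\{b^k\}_{k=1}^m$ in $\mathbb{R}^n$ are given, $b^k = (b^k_1, \dots, b^k_n)$. In terms of the components of $w^\varepsilon$, our system (1) reads:

(2)  $w^{k,\varepsilon}_t + \frac{1}{\varepsilon} b^k \cdot Dw^{k,\varepsilon} = \frac{1}{\varepsilon^2} \sum_{l=1}^m c_{kl} w^{l,\varepsilon}$

for $k = 1, \dots, m$. □

Remark. We can think of (2) as a variant of the PDE (63) in §V.B.5, where the velocity parameterization variable $y$ is now discrete. Thus (1) is a scalar PDE for $w^\varepsilon = w^\varepsilon(x,y,t)$, $y \in \{1, \dots, m\}$. □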

The left hand side of (1) is for each $k$ a linear, constant coefficient transport operator, and the right hand side of (1) represents linear coupling. As $\varepsilon \to 0$, the velocity $\frac{1}{\varepsilon} b^k$ on the left becomes bigger and the coupling on the right gets bigger even faster. What happens in the limit?

To answer we introduce some hypotheses, the meaning of which will be revealed only later. Let us first assume:

(3)  $c_{kl} > 0$ if $k \ne l$,  $\sum_{l=1}^m c_{kl} = 0$.
It follows from Perron-Frobenius theory for matrices that there exists a unique vector

$\pi = (\pi_1, \dots, \pi_m)$

satisfying

(4)  $\pi_k > 0$ $(k = 1, \dots, m)$,  $\sum_{k=1}^m \pi_k = 1$,  and  $\sum_{k=1}^m c_{kl} \pi_k = 0$ $(l = 1, \dots, m)$.

(See for instance Gantmacher [G] for Perron-Frobenius theory, and also look at §VIII.A below.)

We envision $\pi$ as a probability vector on $\Omega = \{1, \dots, m\}$ and then make the assumption of average velocity balance:

(5)  $\sum_{k=1}^m \pi_k b^k = 0$.

2. Construction of diffusion coefficients 

Our goal is to prove that for each $k \in \{1, \dots, m\}$, $w^{k,\varepsilon} \to u$ as $\varepsilon \to 0$, $u$ solving a diffusion equation of the form:

$u_t - \sum_{i,j=1}^n a_{ij} u_{x_i x_j} = 0$ in $\mathbb{R}^n \times (0,\infty)$.

We must construct the matrix $A = ((a_{ij}))$. First, write $\mathbf{1} = (1, \dots, 1) \in \mathbb{R}^m$. Then (3), (4) say

(6)  $C\mathbf{1} = 0$,  $C^*\pi = 0$.

Perron-Frobenius theory tells us that the nullspace of $C$ is one-dimensional and so is spanned by $\mathbf{1}$. Likewise $\pi$ spans the nullspace of $C^*$. In view of (5), for each $j \in \{1, \dots, n\}$, the vector $b_j = (b^1_j, \dots, b^m_j) \in \mathbb{R}^m$ is perpendicular to the nullspace of $C^*$ and thus lies in the range of $C$. Consequently there exists a unique vector $d_j \in \mathbb{R}^m$ solving

(7)  $C d_j = -b_j$  $(j = 1, \dots, n)$,

normalized by our requiring

$d_j \cdot \mathbf{1} = 0$  $(j = 1, \dots, n)$.
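We write $d_j = (d^1_j, \dots, d^m_j)$, and then define the diffusion coefficients

(8)  $a_{ij} = \sum_{k=1}^m \pi_k b^k_i d^k_j$  $(1 \le i,j \le n)$.

Lemma. The matrix $A = ((a_{ij}))$ is nonnegative definite; that is,

(9)  $\sum_{i,j=1}^n a_{ij} \xi_i \xi_j \ge 0$ for each $\xi \in \mathbb{R}^n$.

Proof. Take $\xi = (\xi_1, \dots, \xi_n) \in \mathbb{R}^n$ and write

$\eta_k := \sum_{i=1}^n d^k_i \xi_i$  $(k = 1, \dots, m)$.

Observe further that (7) says

(10)  $b^k_i = -\sum_{l=1}^m c_{kl} d^l_i$  $(1 \le k \le m,\ 1 \le i \le n)$.

Consequently (8) implies

$\sum_{i,j=1}^n a_{ij} \xi_i \xi_j = \sum_{k=1}^m \sum_{i,j=1}^n \pi_k b^k_i d^k_j \xi_i \xi_j$
$= -\sum_{k,l=1}^m \sum_{i,j=1}^n \pi_k c_{kl} d^l_i d^k_j \xi_i \xi_j$
(11)  $= -\sum_{k,l=1}^m \pi_k c_{kl} \eta_k \eta_l$
$= -\sum_{k,l=1}^m s_{kl} \eta_k \eta_l$,

for

$s_{kl} := \frac{\pi_k c_{kl} + \pi_l c_{lk}}{2}$  $(1 \le k,l \le m)$.

The matrix $S = ((s_{kl}))_{m \times m}$ is symmetric, with $s_{kl} > 0$ $(k \ne l)$ and

$S\mathbf{1} = 0$

owing to (6). Since obviously the entries of $\mathbf{1} = (1, \dots, 1)$ are positive, Perron-Frobenius theory asserts that every other eigenvalue of $S$ has real part less than or equal to the eigenvalue (namely 0) associated with $\mathbf{1}$. But as $S$ is symmetric, each eigenvalue is real. So

$\lambda \le 0$

for each eigenvalue $\lambda$ of $S$. Consequently (9) follows from (11). □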
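The construction of the diffusion coefficients can be traced numerically on a small example. The following sketch is not part of the original argument: it assumes a two-state system ($m = 2$, $n = 1$) with hypothetical data `C` and `b`, and it solves the singular systems for the probability vector and for the vectors $d_j$ by replacing one redundant equation with the normalization condition. The helper names `solve` and `diffusion_matrix` are inventions of this example, and the balance hypothesis (5) is assumed rather than checked.

```python
def solve(a, rhs):
    """Solve the square system a x = rhs by Gaussian elimination with
    partial pivoting (pure Python, intended for small systems only)."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def diffusion_matrix(C, b):
    """C is the m x m coupling matrix; b[k] is the velocity vector b^k in R^n.

    Returns (pi, A), where pi spans the nullspace of the transpose of C and
    a_ij = sum_k pi_k b^k_i d^k_j. Hypothesis (5) is assumed, not verified.
    """
    m, n = len(C), len(b[0])
    ones = [1.0] * m
    # pi: (transpose of C) pi = 0 with sum(pi) = 1. One equation of the
    # singular system is redundant, so it is replaced by the normalization.
    Ct = [[C[k][l] for k in range(m)] for l in range(m)]
    pi = solve(Ct[:-1] + [ones], [0.0] * (m - 1) + [1.0])
    d = []
    for j in range(n):
        # d_j: C d_j = -b_j with d_j . 1 = 0, same row-replacement device.
        rhs = [-b[k][j] for k in range(m - 1)] + [0.0]
        d.append(solve(C[:-1] + [ones], rhs))
    A = [[sum(pi[k] * b[k][i] * d[j][k] for k in range(m))
          for j in range(n)] for i in range(n)]
    return pi, A


# Two states moving right and left with unit speed, switching at unit rate.
C = [[-1.0, 1.0], [1.0, -1.0]]
b = [[1.0], [-1.0]]
pi, A = diffusion_matrix(C, b)
```

For this data one finds by hand $\pi = (1/2, 1/2)$, $d_1 = (1/2, -1/2)$, and the single coefficient $a_{11} = 1/2$, which is nonnegative as the Lemma requires.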
3. Passage to limits

Assume now

$g : \mathbb{R}^n \to \mathbb{R}$

is smooth, with compact support. We introduce the initial value problem for (1):

(12)  $w^{k,\varepsilon}_t + \frac{1}{\varepsilon} b^k \cdot Dw^{k,\varepsilon} = \frac{1}{\varepsilon^2} \sum_{l=1}^m c_{kl} w^{l,\varepsilon}$ in $\mathbb{R}^n \times (0,\infty)$,  $w^{k,\varepsilon} = g$ on $\mathbb{R}^n \times \{t = 0\}$

for $k = 1, \dots, m$. Linear PDE theory implies there exists a unique smooth solution $w^\varepsilon$.

Theorem. As $\varepsilon \to 0$, we have for $k = 1, \dots, m$:

(13)  $w^{k,\varepsilon} \to u$ locally uniformly in $\mathbb{R}^n \times [0,\infty)$

where $u$ is the unique solution of

(14)  $u_t - \sum_{i,j=1}^n a_{ij} u_{x_i x_j} = 0$ in $\mathbb{R}^n \times (0,\infty)$,  $u = g$ on $\mathbb{R}^n \times \{t = 0\}$.

Remark. We are asserting that each component $w^{k,\varepsilon}$ of $w^\varepsilon$ converges to the same limit function $u : \mathbb{R}^n \times [0,\infty) \to \mathbb{R}$ and that $u$ solves the diffusion equation (14).

Proof. 1. See [E2] for a proof that

$\{w^\varepsilon\}_{0 < \varepsilon \le 1}$ is bounded, uniformly continuous on compact subsets of $\mathbb{R}^n \times [0,\infty)$.

Thus we can find a subsequence $\varepsilon_r \to 0$ such that

$w^{\varepsilon_r} \to w$ locally uniformly,  $w = (w^1, \dots, w^m)$.

2. We first claim that

(15)  $w^1 = w^2 = \dots = w^m$

at each point $x \in \mathbb{R}^n$, $t \ge 0$, or in other words

$w = u\mathbf{1}$
for some scalar function $u = u(x,t)$. To verify this, take any $v \in C_c^\infty(\mathbb{R}^n \times (0,\infty); \mathbb{R}^m)$ and observe from (12) that

$\int_0^\infty \int_{\mathbb{R}^n} v \cdot (Cw^\varepsilon)\,dxdt = O(\varepsilon)$.

It follows that

$\int_0^\infty \int_{\mathbb{R}^n} v \cdot (Cw)\,dxdt = 0$

for all $v$ as above and so $Cw = 0$ in $\mathbb{R}^n \times (0,\infty)$. Since the nullspace of $C$ is the span of $\mathbf{1}$, (15) follows.
3. Thus

(16)  $w^{k,\varepsilon_r} \to u$ locally uniformly $(k = 1, \dots, m)$.

We next claim that

(17)  $u$ is a viscosity solution of (14).

This means that if $v \in C^2(\mathbb{R}^n \times (0,\infty))$ and

(18)  $u - v$ has a local maximum (resp. minimum) at a point $(x_0,t_0) \in \mathbb{R}^n \times (0,\infty)$,

then

(19)  $v_t(x_0,t_0) - \sum_{i,j=1}^n a_{ij} v_{x_i x_j}(x_0,t_0) \le 0$ (resp. $\ge 0$).

4. To prove this, let us take $v$ as above and suppose $u - v$ has a strict local maximum at some point $(x_0,t_0)$. Define then the perturbed test functions

$v^\varepsilon := (v^{1,\varepsilon}, \dots, v^{m,\varepsilon})$,

where

(20)  $v^{k,\varepsilon} := v - \varepsilon \sum_{j=1}^n d^k_j v_{x_j}$  $(k = 1, \dots, m)$,

the constants $d^k_j$ $(1 \le j \le n,\ 1 \le k \le m)$ satisfying (7). Clearly

(21)  $v^{k,\varepsilon_r} \to v$ locally uniformly $(k = 1, \dots, m)$.
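Since $u - v$ has a strict local maximum at $(x_0,t_0)$, it follows from (16), (21) that

(22)  $w^{k,\varepsilon} - v^{k,\varepsilon}$ has a local maximum near $(x_0,t_0)$ at a point $(x^k_\varepsilon,t^k_\varepsilon)$ $(k = 1, \dots, m)$,

for $\varepsilon = \varepsilon_r$, and

(23)  $(x^k_\varepsilon,t^k_\varepsilon) \to (x_0,t_0)$ as $\varepsilon = \varepsilon_r \to 0$, $k = 1, \dots, m$.

Since $w^\varepsilon$ and $v^\varepsilon$ are smooth functions, it follows from (22) and the PDE (12) that:

(24)  $v^{k,\varepsilon}_t + \frac{1}{\varepsilon} b^k \cdot Dv^{k,\varepsilon} = \frac{1}{\varepsilon^2} \sum_{l=1}^m c_{kl} w^{l,\varepsilon}$

at the point $(x^k_\varepsilon,t^k_\varepsilon)$, $\varepsilon = \varepsilon_r$. Recalling (20), we conclude from (24) that

(25)  $v_t(x_0,t_0) - \sum_{i,j=1}^n b^k_i d^k_j v_{x_i x_j}(x_0,t_0) = -\frac{1}{\varepsilon} \sum_{i=1}^n b^k_i v_{x_i}(x^k_\varepsilon,t^k_\varepsilon) + \frac{1}{\varepsilon^2} \sum_{l=1}^m c_{kl} w^{l,\varepsilon}(x^k_\varepsilon,t^k_\varepsilon) + o(1)$

as $\varepsilon = \varepsilon_r \to 0$, $k = 1, \dots, m$.

5. Now since $w^{l,\varepsilon} - v^{l,\varepsilon}$ has its local maximum near $(x_0,t_0)$ at $(x^l_\varepsilon,t^l_\varepsilon)$, we have

(26)  $(w^{l,\varepsilon} - v^{l,\varepsilon})(x^l_\varepsilon,t^l_\varepsilon) \ge (w^{l,\varepsilon} - v^{l,\varepsilon})(x^k_\varepsilon,t^k_\varepsilon)$,

$\varepsilon = \varepsilon_r$. Recalling that $c_{kl} > 0$ for $k \ne l$, we can employ the inequalities (26) in (25):

$v_t(x_0,t_0) - \sum_{i,j=1}^n b^k_i d^k_j v_{x_i x_j}(x_0,t_0)$
$\le -\frac{1}{\varepsilon} \sum_{i=1}^n b^k_i v_{x_i}(x^k_\varepsilon,t^k_\varepsilon)$
$+ \frac{1}{\varepsilon^2} \sum_{l=1}^m c_{kl} \left[ (w^{l,\varepsilon} - v^{l,\varepsilon})(x^l_\varepsilon,t^l_\varepsilon) + v(x^k_\varepsilon,t^k_\varepsilon) - \varepsilon \sum_{j=1}^n d^l_j v_{x_j}(x^k_\varepsilon,t^k_\varepsilon) \right]$
$+ o(1)$.

But (10) says $\sum_{l=1}^m c_{kl} d^l_i = -b^k_i$, and so the $O(1/\varepsilon)$ terms in the foregoing expression cancel out. Thus

$v_t(x_0,t_0) - \sum_{i,j=1}^n b^k_i d^k_j v_{x_i x_j}(x_0,t_0) \le \frac{1}{\varepsilon^2} \sum_{l=1}^m c_{kl} \left[ (w^{l,\varepsilon} - v^{l,\varepsilon})(x^l_\varepsilon,t^l_\varepsilon) + v(x^k_\varepsilon,t^k_\varepsilon) \right] + o(1)$.

Multiply by $\pi_k > 0$ and sum $k = 1, \dots, m$, recalling (3), (4) to deduce:

$v_t(x_0,t_0) - \sum_{i,j=1}^n \left( \sum_{k=1}^m \pi_k b^k_i d^k_j \right) v_{x_i x_j}(x_0,t_0) \le o(1)$.

Let $\varepsilon = \varepsilon_r \to 0$ to derive the inequality (19). A simple approximation removes the requirement that $u - v$ have a strict maximum, and a similar argument derives the opposite inequality should $u - v$ have a minimum at $(x_0,t_0)$. □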

Commentary. The linear system (12) for each fixed $\varepsilon > 0$ represents a system of linear transport PDE with simple linear coupling. This PDE is reversible in time and yet the diffusion equation (14) is not. The interesting question is this: where did the irreversibility come from? Section VIII.A will provide some further insights. See also Pinsky [P] for other techniques, mostly based upon interpreting (12) as a random evolution. □
CHAPTER 7: Entropy and uncertainty



In this and the subsequent chapter we consider various probabilistic aspects of entropy and 
some implications for PDE theory. The present chapter is a quick introduction to entropy 
in statistical mechanics. 

A. Maxwell's demon 



Let us begin with a simple physical situation, consideration of which will soon suggest 
that there is some kind of connection between entropy, information, and uncertainty. 




[Figure: the gas in its initial state and in its final state.]

Take one mole of a simple ideal gas, and suppose it is initially at equilibrium, being held by a partition in half of a thermally insulated cylinder. The initial volume is $V_i$, and the initial temperature is $T_i$. We remove the partition, the gas fills the entire cylinder, and, after coming to equilibrium, it has final volume $V_f$, final temperature $T_f$.

What is the change of entropy? According to §I.F, we have

(1)  $S_i = C_V \log T_i + R \log V_i + S_0$,  $S_f = C_V \log T_f + R \log V_f + S_0$,

$S_0$ being an arbitrary constant. As there is no heat transfer nor work done to or from the exterior, the internal energy is unchanged. Since, furthermore, the energy depends only on the temperature (see §I.F), we deduce

$T_i = T_f$.

As $V_f = 2V_i$, we deduce that the change of entropy is

$S_f - S_i = R \log 2 > 0$,

in accordance with the Second Law. The mole of gas contains $N_A$ molecules, and so

(2)  change of entropy/particle = $k \log 2$,

since $k = R/N_A$.
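The arithmetic behind (2) is easy to reproduce. The short sketch below evaluates the entropy difference from formula (1) for a free expansion into double the volume; the numerical values of the gas constant and Avogadro's number are the standard SI values, while the choice $C_V = 3R/2$ and the function name `entropy_change` are assumptions made only for illustration (the heat capacity drops out because the temperature does not change).

```python
import math

R = 8.314462618        # gas constant, J / (mol K)
N_A = 6.02214076e23    # Avogadro's number, 1 / mol
k = R / N_A            # Boltzmann's constant, J / K


def entropy_change(V_i, V_f, T_i, T_f, C_V):
    """S_f - S_i from formula (1); the arbitrary constant S_0 cancels."""
    return C_V * math.log(T_f / T_i) + R * math.log(V_f / V_i)


# Free expansion into double the volume at unchanged temperature.
# C_V = 3R/2 is an illustrative choice; it does not affect the result.
dS = entropy_change(V_i=1.0, V_f=2.0, T_i=300.0, T_f=300.0, C_V=1.5 * R)
dS_per_particle = dS / N_A
```

Here `dS` equals $R \log 2$ and `dS_per_particle` equals $k \log 2$, in agreement with (2).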

As the last sentence suggests, it is convenient now to shift attention to the microscopic level, at which the gas can be thought of as a highly complex, random motion of $N_A$ molecules. We next imagine that we reinstall the partition, but now with

(a) a small gate 

and 

(b) a nanotechnology-built robot, which acts as a gatekeeper. 




Our robot is programmed to open the door whenever a gas molecule approaches the door from the right, but to close the door if a gas molecule approaches from the left. After our robot has been at work for a while, we will see more particles in the left region than in the right. This is close to our initial situation.
The effect of our tiny robot has thus been to decrease the entropy, with a very small
expenditure of energy on its part. We have here an apparent contradiction of the Second 
Law. 

Maxwell in 1867 proposed this thought experiment, with an intelligent creature (called 
"Maxwell's demon" by Kelvin) in place of our nanoscale robot. Generations of physicists 
have reconsidered this problem, most notably L. Szilard [SZ], who argued that the Second 
Law is not violated provided the overall entropy of the system increases by k log 2 each time 
the robot measures the direction of an incoming molecule in order to decide whether or not 
to open the gate. As (2) presumably implies the entropy decreases by k log 2 once a particle is 
trapped on the left, the Second Law is saved, provided — to repeat — we appropriately assign 
an entropy to the robot's gaining information about molecule velocities. 

We will not attempt to pursue such reasoning any further, being content to learn from 
this thought experiment that there seems to be some sort of connection between entropy 
and our information about random systems. 

Remark. The book [L-R], edited by Leff and Rex, is a wonderful source for more on Maxwell's demon, entropy concepts in computer science, etc. See also the website www.math.washington.edu/~hillman/entropy.html. □

B. Maximum entropy 

This section introduces a random model for thermal systems and a concept of entropy as a measure of uncertainty. The following is based upon Huang [HU], Jaynes [J], Bamberg-Sternberg [B-S].

1. A probabilistic model

A probabilistic model for thermal systems in equilibrium

We are given:

(i) a triple

$(\Omega, \mathcal{F}, \pi)$,

consisting of a set $\Omega$, a $\sigma$-algebra $\mathcal{F}$ of subsets of $\Omega$, and a nonnegative measure $\pi$ defined on $\mathcal{F}$. (We call $(\Omega, \mathcal{F}, \pi)$ the system, and $\pi$ the reference measure. A typical point $\omega \in \Omega$ is a microstate.)

(ii) the collection of all $\pi$-measurable functions

$\rho : \Omega \to [0,\infty)$,

such that

(1)  $\int_\Omega \rho\,d\pi = 1$.

(We call such a $\rho$ the density of the microstate measure $\rho\,d\pi$.)

and

(iii) a $\pi$-measurable function

(2)  $X : \Omega \to \mathbb{R}^{m+1}$,  $X = (X^0, \dots, X^m)$.

(We call each $X^k$ an observable.)

Notation.

(3)  $E(X,\rho) = \langle X \rangle = \int_\Omega X\rho\,d\pi$ = expected value of $X$, given the microstate distribution $\rho$.


□ 



Physical interpretation. We think of $\Omega$ as consisting of a huge number of microstates $\omega$, each of which is equivalent to a precise, detailed microscopic description of some physical system, e.g. an exact description of the behavior of all the particles in a mole of gas.

The main point is that $\Omega$ is not observable physically. We instead model the state of the system by the probability measure

$\rho\,d\pi$,

where $\rho$ satisfies (1). Thus if $E \in \mathcal{F}$, then

$\int_E \rho\,d\pi$

is the probability that the true (but unobserved) microstate is in $E$, given the density $\rho$. Our goal is to determine, or more precisely to estimate $\rho$, given certain macroscopic physical measurements.

These we model using the observables $X^0, \dots, X^m$.

[Figure: the observables $X = (X^0, \dots, X^m)$ mapping $\Omega$ into $\mathbb{R}^{m+1}$.]

Given $\rho$ as above, we assume that we can physically measure the values

(4)  $E(X,\rho) = \langle X \rangle = \bar X$.

Think of the point $\bar X = (\bar X^0, \dots, \bar X^m)$ as lying in some region $\Sigma \subset \mathbb{R}^{m+1}$, which we may interpret as the macroscopic state space. A point $\bar X \in \Sigma$ thus corresponds to $m + 1$ physical measurements, presumably of extensive parameters as in Chapter I. To accord with the notation from §I.A, we will often write

(5)  $E = \langle X^0 \rangle$.

□

The fundamental problem is this. Given the macroscopic measurements $\bar X = (\bar X^0, \bar X^1, \dots, \bar X^m)$, there are generally many, many microstate distributions $\rho$ satisfying (4). How do we determine the "physically correct" distribution?

2. Uncertainty 

To answer the question just posed, let us first consider the special case that

(6)  $\Omega = \{\omega_1, \dots, \omega_N\}$ is a finite set, $\mathcal{F} = 2^\Omega$, and $\pi$ is counting measure.

Then each distribution $\rho$ as above corresponds to our assigning $\rho(\omega_i) = p_i$ $(i = 1, \dots, N)$, where

(7)  $0 \le p_i \le 1$ $(i = 1, \dots, N)$,  $\sum_{i=1}^N p_i = 1$.

Thus $p_i$ is the probability of $\omega_i$.

We propose now to find a function $S = S(p_1, \dots, p_N)$ which somehow measures the uncertainty or disorder inherent in the probability distribution $\{p_1, \dots, p_N\}$. Let us imagine $S$ is defined for all $N = 1, 2, \dots$ and all $N$-tuples $\{p_1, \dots, p_N\}$ as above.

We will ordain these

Axioms for S: 

A. Continuity

For each $N$, the mapping $(p_1, \dots, p_N) \mapsto S(p_1, \dots, p_N)$ is continuous.

B. Monotonicity

The mapping $N \mapsto S\left(\frac{1}{N}, \dots, \frac{1}{N}\right)$ is monotonically increasing.

C. Composition

For each $N$ and each probability distribution $(p_1, \dots, p_N)$, set

$q_1 = p_1 + \dots + p_{k_1}$, $\dots$, $q_j = p_{k_{j-1}+1} + \dots + p_{k_j}$, $\dots$

where $0 = k_0 < k_1 < k_2 < \dots < k_M = N$. Then

(8)  $S(p_1, \dots, p_N) = S(q_1, \dots, q_M) + \sum_{j=1}^M q_j S\left(p_{k_{j-1}+1}/q_j, \dots, p_{k_j}/q_j\right)$.

Probabilistic interpretation

The monotonicity rule B says that if all the points in $\Omega = \{\omega_1, \dots, \omega_N\}$ have equal probability $\frac{1}{N}$, then there is more uncertainty the bigger $N$ is. The composition rule C applies if we think of subdividing $\Omega = \bigcup_{j=1}^M \Omega_j$, where $\Omega_j = \{\omega_{k_{j-1}+1}, \dots, \omega_{k_j}\}$. Then $q_j$ is the probability of the event $\Omega_j$. Further (8) says that the uncertainty inherent in the distribution $\{p_1, \dots, p_N\}$ on $\Omega$ should equal the uncertainty of the induced probability distribution $\{q_1, \dots, q_M\}$ on $\{\Omega_1, \dots, \Omega_M\}$ plus the "average of the uncertainties within each $\Omega_j$". This last expression is the sum on $j$ of $q_j$, the probability of $\Omega_j$, times $S$ computed for the induced probability distribution $\{p_{k_{j-1}+1}/q_j, \dots, p_{k_j}/q_j\}$ on $\Omega_j$. If some $q_j = 0$, we omit this term. □
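The composition rule can be checked numerically for the candidate function $S(p_1, \dots, p_N) = -K \sum_i p_i \log p_i$, which is the form the Lemma that follows shows to be forced by the axioms. The sketch below is only a sanity check under that assumption; the distribution `p`, the grouping `[0, 2, 5]`, and the helper names are hypothetical choices for this example.

```python
import math


def S(p, K=1.0):
    """S(p_1, ..., p_N) = -K * sum p_i log p_i, with 0 log 0 read as 0."""
    return -K * sum(x * math.log(x) for x in p if x > 0)


def composition_rhs(p, cuts):
    """Right-hand side of (8) for the grouping 0 = k_0 < k_1 < ... < k_M = N.

    `cuts` is the list [k_0, k_1, ..., k_M]; groups with q_j = 0 are omitted,
    as in the text.
    """
    groups = [p[a:b] for a, b in zip(cuts[:-1], cuts[1:])]
    q = [sum(grp) for grp in groups]
    inner = sum(qj * S([x / qj for x in grp])
                for qj, grp in zip(q, groups) if qj > 0)
    return S(q) + inner


p = [0.1, 0.2, 0.3, 0.25, 0.15]
lhs = S(p)                               # uncertainty of the full distribution
rhs = composition_rhs(p, [0, 2, 5])      # groups {1,2} and {3,4,5}
uniform = [S([1.0 / N] * N) for N in range(1, 8)]   # values A(N) for Axiom B
```

The two sides `lhs` and `rhs` agree up to rounding, and the list `uniform` is increasing in $N$, as Axiom B demands.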
Lemma. The axioms imply $S$ has the form

(9)  $S(p_1, \dots, p_N) = -K \sum_{i=1}^N p_i \log p_i$

for some positive constant $K$.

Proof. 1. We follow Jaynes [J]. Suppose $S$ satisfies Axioms A-C, and define

(10)  $A(N) := S\left(\frac{1}{N}, \dots, \frac{1}{N}\right)$  ($N$ terms).

Take

$p_i = \frac{1}{N}$  $(i = 1, \dots, N)$,

and

(11)  $q_j = \frac{n_j}{N}$  $(j = 1, \dots, M)$,

where $\{n_j\}_{j=1}^M$ are integers satisfying

(12)  $\sum_{j=1}^M n_j = N$.

Then (8) implies

$S\left(\frac{1}{N}, \dots, \frac{1}{N}\right) = S(q_1, \dots, q_M) + \sum_{j=1}^M q_j S\left(\frac{1}{n_j}, \dots, \frac{1}{n_j}\right)$.

In terms of (10), this equality reads

(13)  $A(N) = S(q_1, \dots, q_M) + \sum_{j=1}^M q_j A(n_j)$.

Now select $N$ of the form

$N = ML$,

and set $n_j = L$ for $j = 1, \dots, M$. Then (13) implies

$A(ML) = S\left(\frac{1}{M}, \dots, \frac{1}{M}\right) + \sum_{j=1}^M \frac{1}{M} S\left(\frac{1}{L}, \dots, \frac{1}{L}\right) = A(M) + A(L)$.

Thus in particular

(14)  $A(N^a) = aA(N)$ for positive integers $a$, $N$.

Axiom B implies then $A(N) > 0$ $(N = 2, \dots)$.

2. We claim that in fact

(15)  $A(N) = K \log N$  $(N = 1, \dots)$

for some positive constant $K$.

To prove this, let $M$, $N$ be any integers greater than one. Given any large integer $a_k$, choose the integer $b_k$ so that

(16)  $M^{b_k} \le N^{a_k} \le M^{b_k+1}$.

Then

$b_k \log M \le a_k \log N \le (b_k + 1) \log M$,

and so

(17)  $\frac{\log M}{\log N} \le \frac{a_k}{b_k} \le \left(1 + \frac{1}{b_k}\right) \frac{\log M}{\log N}$.

Now since $N \mapsto A(N)$ is increasing according to Axiom B, (16) and (14) imply

$b_k A(M) \le a_k A(N) \le (b_k + 1) A(M)$.

Then, since $A(M) > 0$,

(18)  $\frac{b_k}{a_k} \le \frac{A(N)}{A(M)} \le \frac{b_k + 1}{a_k}$.

Sending $a_k$ and thus $b_k \to \infty$, we conclude from (17), (18) that

$\frac{A(N)}{A(M)} = \frac{\log N}{\log M}$  $(M, N \ge 2)$.

This identity implies $A(N) = K \log N$ for some constant $K$, and necessarily $K > 0$ in light of Axiom B. This proves (15).

3. Now drop the assumption that $n_j = L$ $(j = 1, \dots, M)$. We then deduce from (11)-(15) that

$S\left(\frac{n_1}{N}, \dots, \frac{n_M}{N}\right) = A(N) - \sum_{j=1}^M \frac{n_j}{N} A(n_j)$
$= K\left(\log N - \sum_{j=1}^M \frac{n_j}{N} \log n_j\right)$
$= -K \sum_{j=1}^M \frac{n_j}{N} \log\left(\frac{n_j}{N}\right)$

provided $n_1, \dots, n_M$ are nonnegative integers summing to $N$. In view of Axiom A, formula (9) follows. □

We henceforth agree to take $K = k$, Boltzmann's constant, this choice being suggested by the physical calculations in §V.A.2. Thus

$S(p_1, \dots, p_N) = -k \sum_{i=1}^N p_i \log p_i$

provided $0 \le p_i \le 1$ $(i = 1, \dots, N)$, $\sum_{i=1}^N p_i = 1$.

Return now to the general probabilistic model in §1 for a thermal system in equilibrium. Motivated both by the above formula and our earlier study of Boltzmann's equation in §V.A.2, we hereafter define

(19)  $S(\rho) := -k \int_\Omega \rho \log \rho\,d\pi$

to be the entropy of the microstate density $\rho$, with respect to the reference measure $\pi$. We interpret $S(\rho)$ as measuring the uncertainty or disorder inherent in the probability distribution $\rho\,d\pi$.

C. Maximizing uncertainty 

We can now provide an answer to the question posed at the end of §1, namely how to select the "physically correct" microstate distribution satisfying the macroscopic constraints

(20)  $E(X^k,\rho) = \bar X^k$  $(k = 0, \dots, m)$?

Here is the idea: Since all we really know about $\rho$ are these identities,

we should select the distribution $\rho$ which maximizes the uncertainty (= entropy) $S(\rho)$, subject to the constraints (20).

Remark. This is both a principle of physics (that we should seek maximum entropy configurations (§I.C6)) and a principle of statistics (that we must employ unbiased estimators). See Jaynes [J] for a discussion of the latter. □

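We analyze the foregoing entropy maximization principle by introducing the admissible class:

(21)  $\mathcal{A} = \left\{ \rho : \Omega \to [0,\infty) \mid \rho \text{ is } \pi\text{-measurable},\ \int_\Omega \rho\,d\pi = 1,\ E(X,\rho) = \bar X \right\}$,

where $\bar X = (\bar X^0, \dots, \bar X^m)$ is given. Recall that we write $X : \Omega \to \mathbb{R}^{m+1}$, $X = (X^0, \dots, X^m)$.

Theorem. (i) Assume there exist $\beta \in \mathbb{R}^{m+1}$ and $Z > 0$ such that

(22)  $\sigma = \frac{e^{-\beta \cdot X}}{Z}$

belongs to $\mathcal{A}$. Then

(23)  $S(\sigma) = \max_{\rho \in \mathcal{A}} S(\rho)$.

(ii) Any other maximizer of $S(\cdot)$ over $\mathcal{A}$ differs from $\sigma$ only on a set of $\pi$-measure zero.

Remark. Observe

(24)  $Z = \int_\Omega e^{-\beta \cdot X}\,d\pi$. □

Proof. 1. First note that

$\psi(x) := \frac{1}{x} + \log x$  $(x > 0)$

satisfies

$\psi'(x) = -\frac{1}{x^2} + \frac{1}{x}$, which is $> 0$ if $x > 1$ and $< 0$ if $0 < x < 1$.

Hence

$\psi(x) \ge \psi(1) = 1$ for all $x > 0$,

and so

(25)  $\phi(x) := x \log x - x + 1 \ge 0$  $(x > 0)$,

with equality only for $x = 1$.

2. Define $\sigma$ by (22) and take $\rho \in \mathcal{A}$. Then

(26)  $-\rho \log \rho + \rho \log \sigma \le \sigma - \rho$ on $\Omega$,

since this inequality is equivalent to

$\frac{\rho}{\sigma} \log\left(\frac{\rho}{\sigma}\right) - \frac{\rho}{\sigma} + 1 = \phi\left(\frac{\rho}{\sigma}\right) \ge 0$.

In view of (25) then, (26) holds.

3. Integrate (26) over $\Omega$, recalling that $\rho$ and $\sigma$ both have integral 1:

(27)  $-\int_\Omega \rho \log \rho\,d\pi \le -\int_\Omega \rho \log \sigma\,d\pi$.

But in light of (22):

$\log \sigma = -\log Z - \beta \cdot X$,

and so

$\int_\Omega \rho \log \sigma\,d\pi = -\log Z - \beta \cdot \int_\Omega \rho X\,d\pi = -\log Z - \beta \cdot \bar X$,

since $\rho \in \mathcal{A}$. Since $\sigma \in \mathcal{A}$ as well,

$\int_\Omega \sigma \log \sigma\,d\pi = -\log Z - \beta \cdot \bar X = \int_\Omega \rho \log \sigma\,d\pi$.

Consequently (27) implies

(28)  $S(\rho) \le S(\sigma)$.

We have a strict inequality here unless we have equality in (26) $\pi$-a.e., and this in turn holds only if $\rho = \sigma$ $\pi$-a.e. □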
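The maximization principle can be observed on a finite example. The sketch below assumes a three-point set with counting measure and a single observable; it finds the multiplier by bisection (using the fact that the mean of $X$ under the density $e^{-\beta X}/Z$ decreases as $\beta$ increases) and compares the resulting entropy with that of another admissible density. The data `xs` and `rho`, the unit choice $k = 1$, and all function names are hypothetical.

```python
import math


def gibbs(xs, beta):
    """Density exp(-beta * X) / Z on a finite set with counting measure."""
    w = [math.exp(-beta * x) for x in xs]
    Z = sum(w)
    return [wi / Z for wi in w]


def mean(xs, rho):
    return sum(x * r for x, r in zip(xs, rho))


def entropy(rho, k=1.0):
    return -k * sum(r * math.log(r) for r in rho if r > 0)


def solve_beta(xs, target, lo=-50.0, hi=50.0, iters=200):
    """Bisection for beta with E(X, sigma) = target.

    The mean under the Gibbs density is decreasing in beta, so a mean that
    is too large calls for a larger beta.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean(xs, gibbs(xs, mid)) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xs = [0.0, 1.0, 2.0]      # values of one observable X on three microstates
rho = [0.6, 0.1, 0.3]     # some admissible density; its mean defines the target
target = mean(xs, rho)
sigma = gibbs(xs, solve_beta(xs, target))
```

Both `rho` and `sigma` satisfy the same constraint, and `entropy(sigma)` exceeds `entropy(rho)`, as the Theorem predicts.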

D. Statistical mechanics 

1. Microcanonical distribution 

We consider in somewhat more detail first of all the case that there are no observables, in which case we deduce from Theorem 1 in §B that the entropy $S(\cdot)$ is maximized by the constant microstate distribution

(1)  $\sigma \equiv \frac{1}{Z}$,

where

(2)  $Z = \pi(\Omega)$.

Thus each microstate is equally probable. This is the microcanonical distribution (a.k.a. microcanonical ensemble).

Example 1. Let us take $\Omega$ to be a finite set,

$\Omega = \{\omega_1, \dots, \omega_N\}$,

and take the reference measure $\pi$ to be counting measure. Then (1), (2) imply $Z = N$, $\sigma(\omega_i) = \frac{1}{N}$ for $i = 1, \dots, N$. The entropy is

$S(\sigma) = -k \sum_{i=1}^N \sigma(\omega_i) \log(\sigma(\omega_i)) = k \log N$.

This formula is usually written

(3)  $S = k \log W$

where $W = N = |\Omega|$. □
Example 2. Assume we are given a smooth function

$H : \mathbb{R}^n \to \mathbb{R}$,

called the Hamiltonian. Fix a number $E \in \mathbb{R}$ and set

$\Omega_E = \{x \in \mathbb{R}^n \mid H(x) = E\}$.

We assume that $\Omega_E$ is a smooth, $(n-1)$-dimensional surface in $\mathbb{R}^n$. Consider now the "energy band" $\Delta_\delta = \{x \in \mathbb{R}^n \mid E - \delta \le H(x) \le E + \delta\}$ for small $\delta > 0$ and note

$|\Delta_\delta| = \int_{E-\delta}^{E+\delta} \left( \int_{\{H = t\}} \frac{dS}{|DH|} \right) dt$

according to the Coarea Formula. (See [E-G].) It follows that

$\lim_{\delta \to 0} \frac{|\Delta_\delta|}{2\delta} = \int_{\Omega_E} \frac{dS}{|DH|}$,

assuming $\Omega_E$ is a smooth surface and $|DH| > 0$ on $\Omega_E$. Here $dS$ is $(n-1)$-dimensional surface measure. Now take

$\Omega = \Omega_E$,  $d\pi = \frac{1}{|DH|}\,dS$.

The entropy is then

(4)  $S(E) = k \log\left( \int_{\Omega_E} \frac{dS}{|DH|} \right)$.

Physical interpretation. The Hamiltonian gives us the energy of each microstate. We are here assuming our system is thermally isolated, so that all attainable microstates lie on the energy surface $\{H = E\}$, where $E$ is the macroscopic energy. We can, as in Chapter I, define the temperature $T$ by

$\frac{1}{T} = \frac{\partial S}{\partial E}$.

Remark. Notice that our choice $d\pi = \frac{1}{|DH|}\,dS$ depends not only on the geometry of the level set $\{H = E\}$, but also on $|DH|$. Another plausible possibility would therefore be simply to take $d\pi = dS$.

The issue here is to understand the physical differences between taking the "hard" constraint $H = E$ versus the limit as $\delta \to 0$ of the "softer" constraints $E - \delta \le H \le E + \delta$.

The expository paper [vK-L] by van Kampen and Lodder discusses this and some related issues. □

2. Canonical distribution 

Next we apply the results of §B to the case that we have one observable $X^0 = H$, where

$H : \Omega \to \mathbb{R}$

is the Hamiltonian. As before we write $E$ for the macroscopic energy:

(5)  $E = \langle H \rangle$.

We now invoke Theorem 1 from §B to deduce that the entropy $S(\cdot)$ is maximized by the microstate distribution

(6)  $\sigma = \frac{e^{-\beta H}}{Z}$ for some $\beta \in \mathbb{R}$,

where

(7)  $Z = \int_\Omega e^{-\beta H}\,d\pi$.

This is the canonical distribution (a.k.a. Gibbs' distribution, canonical ensemble). We assume the integral in (7) converges.

Physical interpretation. We should in this context imagine our system as not being thermally isolated, but rather as being in thermal contact with a "heat reservoir" and so being held at a constant temperature $T$. In this setting energy can be transferred in and out of our system. Thus the energy level $H$ of the various microstates is not constant (as in Example 2 in §1) but rather its average value $\langle H \rangle = E$ is determined, as we will see, by $T$. □

Example. Let us take $\Omega = \mathbb{R}^3$, $\pi$ to be Lebesgue measure, and

$H = \frac{m}{2}|v|^2$  $(v \in \mathbb{R}^3)$.

$H$ is the kinetic energy of a particle with mass $m > 0$, velocity $v$. The canonical distribution is then

$\sigma = \frac{1}{Z} e^{-\frac{\beta m}{2}|v|^2}$.

But this is essentially the Boltzmann distribution (43) from §V.A.2, with the macroscopic velocity $v = 0$, the macroscopic particle density $n = 1$, and

$\beta = \frac{1}{k\theta}$,

$\theta$ being the temperature. □
3. Thermodynamics 

We next show how to recover aspects of classical equilibrium thermodynamics (as in Chapter I) from the canonical distribution (6), (7). The point is that all the relevant information is encoded within

(8)  $Z = \int_\Omega e^{-\beta H}\,d\pi$.

We regard (8) as a formula for $Z$ as a function of $\beta$ and call $Z$ the partition function. Remember $H : \Omega \to \mathbb{R}$.

Definitions of thermodynamic quantities in terms of $\beta$ and $Z$. We define

(i) the temperature $T$ by the formula

(9)  $\beta = \frac{1}{kT}$,

(ii) the energy

(10)  $E = -\frac{\partial}{\partial\beta}(\log Z)$,

(iii) the entropy

(11)  $S = k(\beta E + \log Z)$,

and

(iv) the free energy

(12)  $F = -\frac{1}{\beta} \log Z$.

Note carefully: we regard (10)-(12) as defining $E$, $S$, $F$ as functions of $\beta$.

We must check that these definitions are consistent with everything before:
We must check that these definitions are consistent with everything before: 
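Theorem. (i) We have

(13)  $E = \langle H \rangle$,

the expected value of $H$ being computed with respect to the canonical distribution $\sigma$.

(ii) Furthermore

(14)  $S = S(\sigma) = -k \int_\Omega \sigma \log \sigma\,d\pi$,

and

(15)  $\frac{\partial S}{\partial E} = \frac{1}{T}$.

(iii) Finally,

(16)  $F = E - ST$

and

(17)  $\frac{\partial F}{\partial T} = -S$.

Proof. 1. Using the definition (10) we calculate

$E = -\frac{\partial}{\partial\beta}(\log Z) = -\frac{1}{Z}\frac{\partial Z}{\partial\beta} = \frac{1}{Z} \int_\Omega H e^{-\beta H}\,d\pi = \int_\Omega H\sigma\,d\pi = \langle H \rangle$.

This is assertion (i) of the Theorem.

2. We compute as well

$S(\sigma) = -k \int_\Omega \sigma \log \sigma\,d\pi$
$= -\frac{k}{Z} \int_\Omega e^{-\beta H}(-\beta H - \log Z)\,d\pi$
$= k\beta E + k \log Z$.

From (11) we conclude that the identity (14) is valid. Furthermore

$\frac{\partial S}{\partial E} = \frac{\partial S}{\partial\beta} \left(\frac{\partial E}{\partial\beta}\right)^{-1}$
$= k\left(E + \beta \frac{\partial E}{\partial\beta} + \frac{\partial}{\partial\beta}(\log Z)\right) \left(\frac{\partial E}{\partial\beta}\right)^{-1}$
$= k\beta$  by (10)
$= \frac{1}{T}$  by (9).

3. Since (12) says $\log Z = -\beta F$, formulas (9), (11) imply $F = E - TS$. This is (16). We next rewrite (12) to read:

(18)  $\int_\Omega e^{-\beta H}\,d\pi = e^{-\beta F}$

and so

$\int_\Omega e^{\beta(F - H)}\,d\pi = 1$.

Differentiate with respect to $\beta$, recalling that $F$ is a function of $\beta$:

$\int_\Omega \left(F - H + \beta \frac{\partial F}{\partial\beta}\right) e^{\beta(F - H)}\,d\pi = 0$.

Thus (13), (18) imply

(19)  $F - E + \beta \frac{\partial F}{\partial\beta} = 0$.

Now since $\beta = \frac{1}{kT}$,

(20)  $\frac{\partial}{\partial\beta} = -\frac{1}{k\beta^2} \frac{\partial}{\partial T}$.

Therefore

$\beta \frac{\partial F}{\partial\beta} = -\frac{1}{k\beta} \frac{\partial F}{\partial T} = -T \frac{\partial F}{\partial T}$.

Then (19) says

$F = E + T \frac{\partial F}{\partial T}$.

Owing to formula (16), we deduce $S = -\frac{\partial F}{\partial T}$. □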
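Remark. We can define as well the heat capacity

(21)  $C_V = k\beta^2 \frac{\partial^2}{\partial\beta^2}(\log Z)$.

Thus (10), (20) say

(22)  $C_V = \frac{\partial E}{\partial T}$,

consistently with classical thermodynamics. Now since

$E = \langle H \rangle = \frac{1}{Z} \int_\Omega H e^{-\beta H}\,d\pi = \int_\Omega H e^{\beta(F - H)}\,d\pi$,

we have

$\int_\Omega (E - H) e^{\beta(F - H)}\,d\pi = 0$.

Differentiate with respect to $\beta$:

$\frac{\partial E}{\partial\beta} + \int_\Omega (E - H)\left(F - H + \beta \frac{\partial F}{\partial\beta}\right) e^{\beta(F - H)}\,d\pi = 0$.

But $F + \beta \frac{\partial F}{\partial\beta} = E - TS - \frac{1}{k\beta} \frac{\partial F}{\partial T} = E$ and so

$\frac{\partial E}{\partial\beta} + \int_\Omega (E - H)^2 e^{\beta(F - H)}\,d\pi = 0$.

Rewriting, we obtain the formula

(23)  $\langle (E - H)^2 \rangle = kT^2 C_V$,

the average on the left computed with respect to the canonical density. This is a probabilistic interpretation of the heat capacity, recording the variance of the microstate energy $H$ from its macroscopic mean value $E = \langle H \rangle$. □

Remark. Finally we record the observations that the mapping

(24)  $\beta \mapsto \log Z = \log\left( \int_\Omega e^{-\beta H}\,d\pi \right)$

is uniformly convex,

(25)  $S = \min_\beta k(\beta E + \log Z)$,

and

(26)  $\log Z = \max_E \left( -\beta E + \frac{S}{k} \right)$,

where in (25), (26) we regard $S = S(E)$, $Z = Z(\beta)$.

Indeed, we just computed $\frac{\partial^2}{\partial\beta^2}(\log Z) = \langle (E - H)^2 \rangle > 0$ (unless $H \equiv E$). Consequently the minimum on the right hand side of (25) is attained at the unique $\beta$ for which

$\frac{\partial}{\partial\beta}(\log Z) = -E$,

in accordance with (10). Thus (25) follows from (11). Formula (26) is dual to (25).

Observe that (25), (26) imply:

$E \mapsto S$ is concave,
$\beta \mapsto \log Z$ is convex.

□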
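The identities of this section can be checked numerically when the set of microstates is finite and the reference measure is counting measure, so that the partition function is a finite sum. The sketch below makes exactly that assumption; the energy levels `energies`, the value of `beta`, the unit choice $k = 1$, and the finite-difference step `eps` are hypothetical, and derivatives in $\beta$ are approximated by central differences rather than computed exactly.

```python
import math

k = 1.0  # Boltzmann's constant in illustrative units (an assumption)


def log_Z(beta, energies):
    """log of the partition function Z = sum_i exp(-beta * H_i)."""
    return math.log(sum(math.exp(-beta * h) for h in energies))


def canonical(beta, energies):
    Z = math.exp(log_Z(beta, energies))
    return [math.exp(-beta * h) / Z for h in energies]


energies = [0.0, 1.0, 3.0]   # hypothetical energy levels of three microstates
beta, eps = 0.8, 1e-4

sigma = canonical(beta, energies)
mean_H = sum(s * h for s, h in zip(sigma, energies))
var_H = sum(s * (h - mean_H) ** 2 for s, h in zip(sigma, energies))

# Central differences in beta for the derivatives of log Z.
lz_m, lz_0, lz_p = (log_Z(beta + s * eps, energies) for s in (-1, 0, 1))
E = -(lz_p - lz_m) / (2 * eps)                  # definition (10)
d2_logZ = (lz_p - 2 * lz_0 + lz_m) / eps ** 2

S_thermo = k * (beta * E + lz_0)                # definition (11)
S_gibbs = -k * sum(s * math.log(s) for s in sigma)
T = 1.0 / (k * beta)                            # definition (9)
C_V = k * beta ** 2 * d2_logZ                   # definition (21)
```

Up to discretization error, `E` agrees with `mean_H` as in (13), `S_thermo` agrees with `S_gibbs` as in (14), and `var_H` agrees with `k * T**2 * C_V` as in (23).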
CHAPTER 8: Probability and PDE



This chapter introduces some further probabilistic viewpoints that employ entropy notions. These in turn give rise in certain settings to various PDE.

A. Continuous time Markov chains 

We begin however with a simple system of linear ODE. This example both illustrates how 
entropy controls convergence to equilibrium for Markov chains and also partially answers the 
question left open in §VI.C (about the appearance of irreversibility in the diffusion limit). 

Continuous time Markov chain 

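We are given

(i) a finite set $\Sigma$ (called the state space)

and

(ii) a function

$p : [0,\infty) \times \Sigma \times \Sigma \to [0,1]$

such that

(1)  $t \mapsto p(t,\xi,\eta)$ is $C^1$  $(\xi,\eta \in \Sigma)$,

(2)  $p(0,\xi,\eta) = \delta_\xi(\eta)$, which equals 1 if $\eta = \xi$ and 0 otherwise,

(3)  $\sum_{\eta \in \Sigma} p(t,\xi,\eta) = 1$,

and

(4)  $p(t+s,\xi,\eta) = \sum_{\gamma \in \Sigma} p(t,\xi,\gamma)\,p(s,\gamma,\eta)$.

We call $p$ a Markov transition function and (4) the Chapman-Kolmogorov formula.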

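1. Generators and semigroups

Definitions. (i) Define

$c : \Sigma \times \Sigma \to \mathbb{R}$

by

(5)  $c(\xi,\eta) = \lim_{t \to 0} \frac{p(t,\xi,\eta) - p(0,\xi,\eta)}{t}$.

(ii) We further write

(6)  $d(\xi,\eta) = c(\xi,\eta)$ if $\xi \ne \eta$,  $d(\xi,\eta) = 0$ if $\xi = \eta$.

Remark. Owing to (2), (3),

$c(\xi,\eta) \ge 0$ if $\xi \ne \eta$,  $\sum_\eta c(\xi,\eta) = 0$.

Thus

$d(\xi,\eta) \ge 0$.

Definitions. (i) If $f : \Sigma \to \mathbb{R}$, we define

$Lf : \Sigma \to \mathbb{R}$

by $[Lf](\xi) = \sum_\eta c(\xi,\eta) f(\eta)$, or equivalently

(7)  $[Lf](\xi) = \sum_\eta d(\xi,\eta)(f(\eta) - f(\xi))$  $(\xi \in \Sigma)$.

We call $L$ the generator of the Markov process.

(ii) We define also the semigroup $\{S(t)\}_{t \ge 0}$ generated by $L$ by

(8)  $[S(t)f](\xi) = \sum_\eta p(t,\xi,\eta) f(\eta)$.

□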



Probabilistic interpretation. Think of a randomly jumping particle whose position at
time $t \geq 0$ is $X(t) \in \Sigma$. Thus $\{X(t)\}_{t \geq 0}$ is a stochastic process and we may interpret $p(t,\xi,\eta)$
as the probability that $X(t) = \eta$, given that $X(0) = \xi$. According to (2), (5)

    $p(t,\xi,\eta) = \delta_\xi(\eta) + t\, c(\xi,\eta) + o(t)$  as $t \to 0$;

and so if $\xi \neq \eta$, $c(\xi,\eta)$ is the rate of jumps/unit time from $\xi$ to $\eta$. Furthermore

    $[S(t)f](\xi) = E(f(X(t)) \mid X(0) = \xi)$,

the expected value of $f(X(t))$, given that $X(0) = \xi$. Owing to (4), $\{X(t)\}_{t \geq 0}$ is a continuous
time Markov process, with generator L. □

Properties of S(t). For $s, t \geq 0$, we have

(9)    $S(0) = I$,   $S(t+s) = S(t)S(s) = S(s)S(t)$,
       $\frac{d}{dt}(S(t)f) = L(S(t)f) = S(t)(Lf)$,   $S(t)1 = 1$,

where 1 denotes the function identically equal to 1 on $\Sigma$.
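
The properties in (9) can be checked numerically on a toy chain. The sketch below assumes a hypothetical 3-state rate matrix `C` (not taken from the text) and uses the fact that (9) gives $S(t) = e^{tL}$, so that the transition function is the matrix exponential of the rate matrix; the helper names `mat_mul`, `transition` and `semigroup` are illustrative only.

```python
# Toy 3-state chain; the rate matrix C is a hypothetical example, not from the text.
# Entry C[xi][eta] plays the role of c(xi, eta): off-diagonal >= 0, rows sum to 0.
C = [[-1.0, 0.7, 0.3],
     [0.2, -0.5, 0.3],
     [0.5, 0.5, -1.0]]


def mat_mul(A, B):
    """Plain matrix product for small square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition(t, terms=60):
    """Approximate p(t) = exp(t C) by a truncated power series."""
    n = len(C)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms):
        # After this update, term equals (t C)^k / k!.
        term = mat_mul(term, [[t * c / k for c in row] for row in C])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def semigroup(t, f):
    """[S(t) f](xi) = sum over eta of p(t, xi, eta) f(eta), as in (8)."""
    p = transition(t)
    return [sum(p[xi][eta] * f[eta] for eta in range(len(f))) for xi in range(len(f))]


# Chapman-Kolmogorov (4): p(0.7) should equal p(0.3) p(0.4).
p_s, p_t, p_st = transition(0.3), transition(0.4), transition(0.7)
product = mat_mul(p_s, p_t)
ck_error = max(abs(p_st[i][j] - product[i][j]) for i in range(3) for j in range(3))
row_sums = [sum(row) for row in p_st]  # property (3): each should be 1
```

Here `ck_error` measures the failure of the Chapman-Kolmogorov formula (4), and `semigroup(t, [1, 1, 1])` illustrates $S(t)1 = 1$.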

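Definition. Let $\mu$ be a probability measure on $\Sigma$. We define then the probability measure
$S^*(t)\mu$ by requiring

(10)    $\int_\Sigma S(t)f\, d\mu = \int_\Sigma f\, dS^*(t)\mu$   $(t \geq 0)$

for each $f : \Sigma \to \mathbb{R}$. We call $\{S^*(t)\}_{t \geq 0}$ the dual semigroup.

Now fix a reference probability measure $\pi$ on $\Sigma$, with

    $\pi(\eta) > 0$ for all $\eta \in \Sigma$.

Notation. Given a probability measure $\mu$, write

(11)    $\rho(\cdot,t) = \frac{dS^*(t)\mu}{d\pi}$,

so that

(12)    $[S^*(t)\mu](\eta) = \rho(\eta,t)\,\pi(\eta)$   $(\eta \in \Sigma,\ t \geq 0)$.

Lemma. We have

(13)    $\frac{\partial \rho}{\partial t} = L^*\rho$  on $\Sigma \times [0,\infty)$,

where $L^*$ is the adjoint of L with respect to $\pi$.

We call (13) the forward equation.

Proof. Let us first note

    $\int_\Sigma \rho f\, d\pi = \int_\Sigma f\, dS^*(t)\mu = \int_\Sigma S(t)f\, d\mu$.

Thus

    $\int_\Sigma \frac{\partial \rho}{\partial t} f\, d\pi = \frac{d}{dt}\int_\Sigma S(t)f\, d\mu$
        $= \int_\Sigma S(t)Lf\, d\mu$
        $= \int_\Sigma Lf\, dS^*(t)\mu$
        $= \int_\Sigma (Lf)\,\rho\, d\pi = \int_\Sigma f\, L^*\rho\, d\pi$.

This identity is valid for all $f : \Sigma \to \mathbb{R}$ and so (13) follows. □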

2. Entropy production

Definition. We say the probability measure $\pi$ is invariant provided

(14)    $S^*(t)\pi = \pi$  for all $t \geq 0$.

Remark. It is easy to see that $\pi$ is invariant if and only if

(15)    $\int_\Sigma Lf\, d\pi = 0$

or, equivalently,

(16)    $\int_\Sigma S(t)f\, d\pi = \int_\Sigma f\, d\pi$   $(t \geq 0)$

for all $f : \Sigma \to \mathbb{R}$. □

We wish to identify circumstances under which $S^*(t)\mu$ converges to an invariant measure as
$t \to \infty$. The key will be certain estimates about the rate of entropy production.

Definition. Let $\pi$ be a probability measure on $\Sigma$, with $\pi(\eta) > 0$ for each $\eta \in \Sigma$. If $\mu$ is
another probability measure, we define the entropy of $\mu$ with respect to $\pi$ to be

(17)    $H(\mu,\pi) = \int_\Sigma \rho \log \rho\, d\pi$,

where

    $\rho = d\mu/d\pi$.

Remark. Since $\Sigma$ is finite, (17) says

(18)    $H(\mu,\pi) = \sum_\eta \log\left(\frac{\mu(\eta)}{\pi(\eta)}\right)\mu(\eta)$.

Clearly $\mu \mapsto H(\mu,\pi)$ is continuous. □

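Lemma. Let $\pi$ be an invariant probability measure, with $\pi(\eta) > 0$ for all $\eta \in \Sigma$. Take $\mu$ to
be any probability measure. Then

(19)    $\frac{d}{dt} H(S^*(t)\mu, \pi) \leq 0$.

Proof. 1. Write

    $\phi(x) := x \log x - x + 1$   $(x \geq 0)$.

As noted in §VII,

(20)    $\phi$ is convex, $\phi \geq 0$ for $x \geq 0$.

2. Take any $\mu$ and set $\rho(\cdot,t) = dS^*(t)\mu/d\pi$. Assume $\rho(\cdot,t) > 0$. Then, using the forward equation (13),

    $\frac{d}{dt} H(S^*(t)\mu,\pi) = \int_\Sigma \rho_t \log \rho\, d\pi + \int_\Sigma \rho_t\, d\pi$
        $= \int_\Sigma (L^*\rho) \log \rho\, d\pi + \int_\Sigma L^*\rho\, d\pi$.

Now

    $\int_\Sigma L^*\rho\, d\pi = \int_\Sigma \rho\,(L1)\, d\pi = 0$,

owing to (7). Thus

    $\frac{d}{dt} H(S^*(t)\mu,\pi) = \int_\Sigma \rho\, L(\log \rho)\, d\pi$
        $= \sum_\xi \rho(\xi,t) \left( \sum_\eta d(\xi,\eta) \log\frac{\rho(\eta,t)}{\rho(\xi,t)} \right) \pi(\xi)$
        $= -\sum_{\xi,\eta} d(\xi,\eta)\, \frac{\rho(\xi,t)}{\rho(\eta,t)} \log\left(\frac{\rho(\xi,t)}{\rho(\eta,t)}\right) \pi(\xi)\,\rho(\eta,t)$
        $= -\sum_{\xi,\eta} d(\xi,\eta)\, \phi\left(\frac{\rho(\xi,t)}{\rho(\eta,t)}\right) \pi(\xi)\,\rho(\eta,t)$
          $+ \sum_{\xi,\eta} d(\xi,\eta)\,(\rho(\eta,t) - \rho(\xi,t))\,\pi(\xi)$.

By (7), the last expression is

    $\int_\Sigma L\rho(\cdot,t)\, d\pi = 0$,

since $\pi$ is invariant. Since $\phi \geq 0$ and $d \geq 0$, estimate (19) results.

If $\rho$ is not everywhere positive, we omit the sites where $\rho = 0$ in the foregoing calculation.

□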
Remark. We call

(21)    $G(t) := \sum_{\xi,\eta} d(\xi,\eta)\, \phi\left(\frac{\rho(\xi,t)}{\rho(\eta,t)}\right) \pi(\xi)\,\rho(\eta,t) \geq 0$

the rate of entropy production at time t. Then

    $\frac{d}{dt} H(S^*(t)\mu,\pi) = -G(t) \leq 0$.

□
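
As an illustration of the entropy inequality, the following sketch tracks $H(S^*(t)\mu,\pi)$ along a discretized chain. Everything specific here is an assumption made for the example: the 3-state rate matrix `C`, the time step `H_STEP`, and the use of forward Euler steps $\mu \mapsto \mu(I + hC)$ in place of the exact dual semigroup. For small h the Euler step is itself a stochastic matrix with the same invariant measure, so the computed entropies should be nonincreasing.

```python
import math

# Hypothetical 3-state rate matrix (rows sum to zero); not taken from the text.
C = [[-1.0, 0.7, 0.3],
     [0.2, -0.5, 0.3],
     [0.5, 0.5, -1.0]]
H_STEP = 0.01  # Euler time step; I + H_STEP * C has nonnegative entries


def step(mu, h=H_STEP):
    """One forward Euler step of d(mu)/dt = mu C (an approximation of S*(t))."""
    n = len(mu)
    return [mu[eta] + h * sum(mu[xi] * C[xi][eta] for xi in range(n))
            for eta in range(n)]


def relative_entropy(mu, pi):
    """H(mu, pi) = sum over eta of log(mu/pi) * mu, formula (18); 0 log 0 = 0."""
    return sum(m * math.log(m / p) for m, p in zip(mu, pi) if m > 0.0)


def invariant_measure(iters=20000):
    """Approximate the invariant measure by running the chain for a long time."""
    mu = [1.0 / 3.0] * 3
    for _ in range(iters):
        mu = step(mu)
    return mu


pi = invariant_measure()
mu = [1.0, 0.0, 0.0]  # start from a point mass
entropies = []
for _ in range(2000):
    entropies.append(relative_entropy(mu, pi))
    mu = step(mu)
```

The list `entropies` starts at $\log(1/\pi(\xi))$ for the starting site and decreases toward zero, in line with the convergence to equilibrium discussed next.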



3. Convergence to equilibrium

Definition. The Markov chain is called irreducible if

(22)    $p(t,\xi,\eta) > 0$   $(t > 0,\ \xi, \eta \in \Sigma)$.

Remark. It is straightforward to show that the Markov chain is irreducible if and only if
for each pair $\xi, \eta \in \Sigma$, $\xi \neq \eta$, there exists a "path"

    $\xi = \gamma_0, \gamma_1, \ldots, \gamma_m = \eta$

with

    $d(\gamma_i, \gamma_{i+1}) > 0$   $(i = 0, \ldots, m-1)$.

□

Theorem. Assume the Markov chain is irreducible. Then

(i) there exists a unique invariant probability measure $\pi > 0$,

and

(ii) for each probability measure $\mu$,

(23)    $\lim_{t \to \infty} S^*(t)\mu = \pi$.

Proof. 1. First we build $\pi$. Fix any site $\xi_0 \in \Sigma$ and write

    $\pi(t,\eta) := \frac{1}{t}\int_0^t p(s,\xi_0,\eta)\, ds$   $(t > 0)$.

Define then

(24)    $\pi(\eta) := \lim_{t_k \to \infty} \pi(t_k,\eta)$,

the sequence $t_k \to \infty$ selected so that the limit (24) exists for each $\eta \in \Sigma$. Clearly

(25)    $0 \leq \pi(\eta) \leq 1$,   $\sum_\eta \pi(\eta) = 1$.

2. In addition, if $h > 0$ the Chapman-Kolmogorov formula (4) implies

    $\sum_\gamma \left( \frac{1}{t}\int_0^t p(s,\xi_0,\gamma)\, ds \right) p(h,\gamma,\eta) = \frac{1}{t}\int_0^t p(s+h,\xi_0,\eta)\, ds$
        $= \frac{1}{t}\int_0^t p(s,\xi_0,\eta)\, ds + \frac{1}{t}\int_t^{t+h} p(s,\xi_0,\eta)\, ds - \frac{1}{t}\int_0^h p(s,\xi_0,\eta)\, ds$.

Let $t = t_k \to \infty$ and recall (24):

(26)    $\sum_\gamma \pi(\gamma)\, p(h,\gamma,\eta) = \pi(\eta)$   $(\eta \in \Sigma)$.

Then (25) and the irreducibility condition (22) imply

(27)    $\pi(\eta) > 0$ for each $\eta \in \Sigma$.

Next differentiate (26) with respect to h, and set $h = 0$:

    $\sum_\gamma \pi(\gamma)\, c(\gamma,\eta) = 0$   $(\eta \in \Sigma)$.

This identity implies

    $\int_\Sigma Lf\, d\pi = 0$

for all $f : \Sigma \to \mathbb{R}$ and so

    $\pi$ is an invariant measure.

3. Next fix any $\xi \in \Sigma$ and define

    $\delta_\xi(\eta) = 1$ if $\eta = \xi$,   $\delta_\xi(\eta) = 0$ if $\eta \neq \xi$.

Then

(28)    $p(t,\xi,\cdot) = S^*(t)\delta_\xi$.

According to the Lemma,

(29)    $t \mapsto H(p(t,\xi,\cdot),\pi)$ is nonincreasing in t.

Now select any sequence $t_l \to \infty$ such that the limit

(30)    $\nu(\eta) := \lim_{t_l \to \infty} p(t_l,\xi,\eta)$

exists for each $\eta \in \Sigma$. Then (29) implies

    $\inf_{t \geq 0} H(p(t,\xi,\cdot),\pi) = \lim_{t \to \infty} H(p(t,\xi,\cdot),\pi)$
        $= \lim_{t_l \to \infty} H(p(t_l,\xi,\cdot),\pi)$
        $= H(\nu,\pi)$.

Also

    $S^*(t)\nu = \lim_{t_l \to \infty} S^*(t)\, p(t_l,\xi,\cdot)$
        $= \lim_{t_l \to \infty} S^*(t + t_l)\,\delta_\xi$
        $= \lim_{t_l \to \infty} p(t + t_l,\xi,\cdot)$.

Consequently

    $H(S^*(t)\nu,\pi) = \lim_{t_l \to \infty} H(p(t + t_l,\xi,\cdot),\pi) = H(\nu,\pi)$.

Thus

(31)    $t \mapsto H(S^*(t)\nu,\pi)$ is constant.

Set

    $\rho(\cdot,t) = dS^*(t)\nu/d\pi > 0$.

Then (21), (31) imply the rate of entropy production

    $G(t) = \sum_{\xi,\eta} d(\xi,\eta)\, \phi\left(\frac{\rho(\xi,t)}{\rho(\eta,t)}\right) \pi(\xi)\,\rho(\eta,t) = 0$

for $t \geq 0$. Since $\pi, \rho > 0$, we have

    $\phi\left(\frac{\rho(\xi,t)}{\rho(\eta,t)}\right) = 0$ and so $\rho(\xi,t) = \rho(\eta,t)$
    for each $\xi, \eta$ with $d(\xi,\eta) > 0$.

The Remark after (22) thus implies $\rho(\cdot,t)$ is constant and so

    $S^*(t)\nu = \pi$ for all $t \geq 0$.

So $\nu = \pi$ and therefore

    $S^*(t)\delta_\xi \to \pi$ as $t \to \infty$

for each $\xi \in \Sigma$. Assertion (23) follows. □

B. Large deviations

1. Thermodynamic limits

We turn next to the theory of large deviations, which will provide links between certain
limit problems in probability, statistical mechanics, and various linear and nonlinear PDE.

Physical motivation. To motivate the central issues, let us first recall from Chapter VII
these formulas for the canonical distribution:

    $F = -\frac{1}{\beta}\log Z$,   $Z = \int_\Omega e^{-\beta H}\, d\pi$.

Now it is most often the case in statistical mechanics that we are interested in a sequence
of free energies and partition functions:

(1)    $F_N = -\frac{1}{\beta}\log Z_N$,   $Z_N = \int_{\Omega_N} e^{-\beta H_N}\, d\pi_N$   $(N = 1, 2, \ldots)$.

Typically (1) represents the free energy and partition function for a system of N interacting
particles (described by microstates in the system $(\Omega_N, \mathcal{F}_N, \pi_N)$, with Hamiltonian $H_N :
\Omega_N \to \mathbb{R}$). We often wish to compute the limit as $N \to \infty$ of the free energy per particle:

(2)    $f(\beta) = \lim_{N \to \infty} \frac{1}{N} F_N = -\frac{1}{\beta} \lim_{N \to \infty} \frac{1}{N} \log Z_N$.

Understanding in various models the behavior of the mapping $\beta \mapsto f(\beta)$ is a central problem
in statistical mechanics; see for instance Thompson [T, §3.6]. We call (2) a thermodynamic
limit.

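To help us understand the mathematical and physical issues here, let us rewrite

    $Z_N = \int_{\Omega_N} e^{-\beta H_N}\, d\pi_N = \int_\Sigma e^{-N\beta\xi}\, dP_N$,

where the state space $\Sigma$ is $\mathbb{R}^1$ and $P_N$ is the distribution of $\frac{1}{N}H_N$ on $\Sigma$; that is,

(3)    $P_N(-\infty,\gamma] = \pi_N\left\{ \omega \in \Omega_N \ \Big|\ \frac{H_N(\omega)}{N} \leq \gamma \right\}$

for $\gamma \in \mathbb{R}^1$. Setting

    $\varepsilon = \frac{1}{N}$,

we recast (2) as

(4)    $f(\beta) = -\frac{1}{\beta} \lim_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{-\beta\xi/\varepsilon}\, dP_\varepsilon \right)$.

Let us next suppose that for small $\varepsilon > 0$:

(5)    $dP_\varepsilon \approx e^{-I/\varepsilon}\, dQ$

in some unspecified sense, where $I : \Sigma \to \mathbb{R}$ and Q is a reference measure on $\Sigma$. Inserting (5)
into (4), we may expect that

    $\lim_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{(-\beta\xi - I(\xi))/\varepsilon}\, dQ \right) = \sup_\xi\,(-\beta\xi - I(\xi))$.

Consequently, supposing the foregoing computations are somehow legitimate, we deduce

(6)    $f(\beta) = \frac{1}{\beta} \inf_\xi\,(\beta\xi + I(\xi))$.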

What is the physical meaning of this formula? First note that the energy per particle is

    $\frac{1}{N} E_N = \frac{1}{N} \langle H_N \rangle$.

Hence as $N \to \infty$, we may expect

(7)    $\frac{E_N}{N} \to e$,

where

(8)    $\beta e + I(e) = \inf_\xi\,(\beta\xi + I(\xi))$.

Let us therefore interpret e as the energy in the thermodynamic limit.
Next recall from §VIII.C.3 the formula

    $\log Z = \sup_E \left( -\beta E + \frac{S}{k} \right)$

and remember

    $F = -\frac{1}{\beta}\log Z$.

Thus

(9)    $-\beta F = \sup_E \left( -\beta E + \frac{S}{k} \right)$.

On the other hand, (6) says

(10)    $-\beta f = \sup_\xi\,(-\beta\xi - I(\xi))$.

In view of (8)-(10) we might then conjecture that

    $\left( -\frac{S}{k} \right)^* = I^*$,

S denoting the entropy and * the Legendre transform. As S is presumably concave, we
deduce then

(11)    $S = -k I^{**}$.

Should I also be convex, then (11) reduces to

(12)    $S = -kI$.

In this case our supposition (5), which we now rewrite as

(13)    $dP_\varepsilon \approx e^{S/(k\varepsilon)}\, dQ$,

says that entropy S controls the asymptotics of $\{P_\varepsilon\}_{0 < \varepsilon \leq 1}$ in the thermodynamic limit and
that the most likely states for small $\varepsilon > 0$ are those which maximize the entropy.

2. Basic theory 

We now follow Donsker-Varadhan (see e.g. Varadhan [V], Dembo-Zeitouni [D-Z], etc.) 
and provide a general probabilistic framework within which to understand and generalize 
the foregoing heuristics. 

a. Rate functions

Notation. Hereafter

    $\Sigma$ denotes a separable, complete, metric space,

and

    $\{P_\varepsilon\}_{0 < \varepsilon \leq 1}$ is a family of Borel probability measures on $\Sigma$.

□

Definition. We say that $\{P_\varepsilon\}_{0 < \varepsilon \leq 1}$ satisfies the large deviation principle with rate function
I provided

(i) $I : \Sigma \to [0,\infty]$ is lower semicontinuous, $I \not\equiv +\infty$,

(ii) for each $l \in \mathbb{R}$,

(14)    the set $\{\xi \in \Sigma \mid 0 \leq I(\xi) \leq l\}$ is compact,

(iii) for each closed set $C \subseteq \Sigma$,

(15)    $\limsup_{\varepsilon \to 0} \varepsilon \log P_\varepsilon(C) \leq -\inf_C I$,

and

(iv) for each open set $U \subseteq \Sigma$,

(16)    $\liminf_{\varepsilon \to 0} \varepsilon \log P_\varepsilon(U) \geq -\inf_U I$.

Remarks. (i) If E is a Borel subset of $\Sigma$ for which

    $\inf_{E^\circ} I = \inf_E I = \inf_{\bar{E}} I$,

then

    $\lim_{\varepsilon \to 0} \varepsilon \log P_\varepsilon(E) = -\inf_E I$.

This gives a precise meaning to the heuristic (5) from the previous section (without the
unnecessary introduction of the reference measure).

(ii) The rate function I is called the entropy function in Ellis [EL]. This book contains
clear explanations of the connections with statistical mechanics.

(iii) It is sometimes convenient to consider instead of $\varepsilon \to 0$ an index $n \to \infty$. Thus we
say $\{P_n\}_{n=1}^\infty$ satisfies the large deviation principle with rate function I if

(17)    $\limsup_{n \to \infty} \frac{1}{n} \log P_n(C) \leq -\inf_C I$   (C closed)
        and
        $\liminf_{n \to \infty} \frac{1}{n} \log P_n(U) \geq -\inf_U I$   (U open).

□



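b. Asymptotic evaluations of integrals

We now make clearer the connections between large deviation theory and the heuristics
in §1.

Theorem 1. Let $\{P_\varepsilon\}_{0 < \varepsilon \leq 1}$ satisfy the large deviation principle with rate function I. Let

    $g : \Sigma \to \mathbb{R}$

be bounded, continuous. Then

(18)    $\lim_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{g/\varepsilon}\, dP_\varepsilon \right) = \sup_\Sigma\,(g - I)$.

Proof. 1. Fix $\delta > 0$. We write

    $\Sigma = \bigcup_{i=1}^N C_i$,

where each $C_i$ is closed and the oscillation of g on $C_i$ is less than or equal to $\delta$ $(i = 1, \ldots, N)$.
(Assuming without loss that $g \geq 0$, we can for instance take

    $C_i = \{\xi \in \Sigma \mid (i-1)\delta \leq g(\xi) \leq i\delta\}$.)

Then

    $\int_\Sigma e^{g/\varepsilon}\, dP_\varepsilon \leq \sum_{i=1}^N \int_{C_i} e^{g/\varepsilon}\, dP_\varepsilon \leq \sum_{i=1}^N e^{(g_i + \delta)/\varepsilon}\, P_\varepsilon(C_i)$,

where

    $g_i = \inf_{C_i} g$   $(i = 1, \ldots, N)$.

Thus

    $\log\left( \int_\Sigma e^{g/\varepsilon}\, dP_\varepsilon \right) \leq \log\left( N \max_{1 \leq i \leq N} e^{(g_i + \delta)/\varepsilon}\, P_\varepsilon(C_i) \right)$
        $= \log N + \max_{1 \leq i \leq N} \left[ \frac{g_i + \delta}{\varepsilon} + \log P_\varepsilon(C_i) \right]$,

and so (15) implies

    $\limsup_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{g/\varepsilon}\, dP_\varepsilon \right) \leq \max_{1 \leq i \leq N} \left[ (g_i + \delta) - \inf_{C_i} I \right]$
        $\leq \max_{1 \leq i \leq N} \sup_{C_i}\,(g - I) + \delta$
        $= \sup_\Sigma\,(g - I) + \delta$.

Consequently

(19)    $\limsup_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{g/\varepsilon}\, dP_\varepsilon \right) \leq \sup_\Sigma\,(g - I)$.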
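2. Again fix $\delta > 0$. There exists $\eta \in \Sigma$ with

(20)    $g(\eta) - I(\eta) \geq \sup_\Sigma\,(g - I) - \frac{\delta}{2}$.

Since g is continuous, there exists an open neighborhood U of $\eta$ such that

(21)    $g(\xi) \geq g(\eta) - \frac{\delta}{2}$  for $\xi \in U$.

Then

    $\liminf_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{g/\varepsilon}\, dP_\varepsilon \right) \geq \liminf_{\varepsilon \to 0} \varepsilon \log\left( \int_U e^{g/\varepsilon}\, dP_\varepsilon \right)$
        $\geq \liminf_{\varepsilon \to 0} \varepsilon \log\left( \int_U e^{(g(\eta) - \delta/2)/\varepsilon}\, dP_\varepsilon \right)$   by (21)
        $= g(\eta) - \frac{\delta}{2} + \liminf_{\varepsilon \to 0} \varepsilon \log P_\varepsilon(U)$
        $\geq g(\eta) - \frac{\delta}{2} - \inf_U I$   by (16)
        $\geq g(\eta) - I(\eta) - \frac{\delta}{2}$
        $\geq \sup_\Sigma\,(g - I) - \delta$   by (20).

Hence

    $\liminf_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{g/\varepsilon}\, dP_\varepsilon \right) \geq \sup_\Sigma\,(g - I)$.

This bound and (19) finish the proof of (18). □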
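
A small numerical experiment can make the limit (18) concrete. The sketch below is only an illustration under stated assumptions: it takes $P_\varepsilon$ to be the Gaussian law $N(0,\varepsilon)$ on $\mathbb{R}$, for which the rate function is $I(x) = x^2/2$, and the hypothetical bounded continuous function $g(x) = \max(-5, -|x-1|)$, for which $\sup(g - I) = -1/2$ is attained at $x = 1$. The integral is approximated by a Riemann sum evaluated in log-sum-exp form; the function names and grid parameters are choices made for this example.

```python
import math


def g(x):
    """Hypothetical bounded, continuous test function."""
    return max(-5.0, -abs(x - 1.0))


def scaled_log_integral(eps, lo=-10.0, hi=10.0, n=20000):
    """Approximate eps * log( integral of exp(g/eps) dP_eps ), P_eps = N(0, eps)."""
    dx = (hi - lo) / n
    # Exponents (g(x) - x^2/2) / eps on a uniform grid; the Gaussian
    # normalization 1/sqrt(2 pi eps) is added back after the log-sum-exp.
    exponents = []
    for i in range(n + 1):
        x = lo + i * dx
        exponents.append((g(x) - 0.5 * x * x) / eps)
    m = max(exponents)
    log_sum = m + math.log(sum(math.exp(a - m) for a in exponents))
    return eps * (log_sum + math.log(dx) - 0.5 * math.log(2.0 * math.pi * eps))


SUP_G_MINUS_I = -0.5  # sup over x of g(x) - x^2/2, attained at x = 1

errors = {eps: abs(scaled_log_integral(eps) - SUP_G_MINUS_I)
          for eps in (0.1, 0.01)}
```

As $\varepsilon$ decreases, the entries of `errors` shrink, consistent with (18); the residual is of order $\varepsilon$ and comes from the prefactor that the logarithmic scaling discards.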

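We will require in the next section the converse statement.

Theorem 2. Assume the limit (18) holds for each bounded, Lipschitz function $g : \Sigma \to \mathbb{R}$,
where $I : \Sigma \to [0,\infty]$ is lower semicontinuous, $I \not\equiv +\infty$. Suppose also the sets $\{0 \leq I \leq l\}$
are compact for each $l \in \mathbb{R}$.

Then $\{P_\varepsilon\}_{0 < \varepsilon \leq 1}$ satisfies the large deviation principle with rate function I.

Proof. We must prove (15), (16).

1. Let $C \subseteq \Sigma$ be closed and set

(22)    $g_m(\xi) = \max(-m,\ -m\,\mathrm{dist}(\xi, C))$   $(m = 1, \ldots)$.

Then $g_m$ is bounded, Lipschitz and, since $g_m = 0$ on C,

    $\varepsilon \log\left( \int_\Sigma e^{g_m/\varepsilon}\, dP_\varepsilon \right) \geq \varepsilon \log P_\varepsilon(C)$.

Thus

    $\limsup_{\varepsilon \to 0} \varepsilon \log P_\varepsilon(C) \leq \lim_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{g_m/\varepsilon}\, dP_\varepsilon \right) = \sup_\Sigma\,(g_m - I)$.

Let $m \to \infty$, noting from (22) that

    $\sup_\Sigma\,(g_m - I) \to \sup_C\,(-I) = -\inf_C I$,

since C is closed and I is lower semicontinuous. The limit (15) is proved.

2. Next suppose $U \subseteq \Sigma$ is open and take

    $C_k = \left\{ \xi \in U \ \Big|\ \mathrm{dist}(\xi, \Sigma - U) \geq \frac{1}{k} \right\}$.

Then $C_k \subseteq U$, $C_k$ is closed $(k = 1, \ldots)$. Define

    $g_m(\xi) = k \max\left( -\frac{m}{k},\ -m\,\mathrm{dist}(\xi, C_k) \right)$.

Note that

    $-m \leq g_m \leq 0$,   $g_m = 0$ on $C_k$,   $g_m = -m$ on $\Sigma - U$.

Then

    $\varepsilon \log\left( \int_\Sigma e^{g_m/\varepsilon}\, dP_\varepsilon \right) \leq \varepsilon \log\left( P_\varepsilon(U) + e^{-m/\varepsilon} P_\varepsilon(\Sigma - U) \right)$
        $\leq \varepsilon \log\left( P_\varepsilon(U) + e^{-m/\varepsilon} \right)$.

Thus

(23)    $\sup_{C_k}\,(-I) \leq \sup_\Sigma\,(g_m - I)$
            $= \lim_{\varepsilon \to 0} \varepsilon \log\left( \int_\Sigma e^{g_m/\varepsilon}\, dP_\varepsilon \right)$
            $\leq \liminf_{\varepsilon \to 0} \varepsilon \log\left( P_\varepsilon(U) + e^{-m/\varepsilon} \right)$.

But

    $\varepsilon \log\left( P_\varepsilon(U) + e^{-m/\varepsilon} \right) \leq \varepsilon \log\left( 2\max(P_\varepsilon(U), e^{-m/\varepsilon}) \right)$
        $= \varepsilon \log 2 + \max(\varepsilon \log P_\varepsilon(U),\ -m)$,

and so

    $\liminf_{\varepsilon \to 0} \varepsilon \log\left( P_\varepsilon(U) + e^{-m/\varepsilon} \right) \leq \max\left( \liminf_{\varepsilon \to 0} \varepsilon \log P_\varepsilon(U),\ -m \right)$.

Combining this calculation with (23) and sending $m \to \infty$, we deduce

    $-\inf_{C_k} I \leq \liminf_{\varepsilon \to 0} \varepsilon \log P_\varepsilon(U)$.

Since $U = \bigcup_{k=1}^\infty C_k$, the limit (16) follows. □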

C. Cramér's Theorem

In this section we illustrate the use of PDE methods by presenting an unusual proof, due
to R. Jensen, of Cramér's Theorem, characterizing large deviations for sums of i.i.d. random
variables.

More precisely, take $(\Omega, \mathcal{F}, \pi)$ to be a probability space and suppose

(1)    $Y_k : \Omega \to \mathbb{R}^m$ $(k = 1, \ldots)$ are independent, identically distributed
       random variables.

We write $Y = Y_1$ and assume as well that

(2)    the exponential generating function $Z = E(e^{p \cdot Y})$
       is finite for each $p \in \mathbb{R}^m$,

where $E(\cdot)$ denotes expected value. Thus

    $Z = \int_\Omega e^{p \cdot Y}\, d\pi$.

We turn attention now to the partial sums

(3)    $S_n = \frac{Y_1 + \cdots + Y_n}{n}$

and their distributions $P_n$ on $\Sigma = \mathbb{R}^m$ $(n = 1, \ldots)$.
Next define

(4)    $F := \log Z$,

that is,

(5)    $F(p) = \log E(e^{p \cdot Y}) = \log\left( \int_\Omega e^{p \cdot Y}\, d\pi \right)$.

We introduce also the Legendre transform of F:

(6)    $L(q) = \sup_{p \in \mathbb{R}^m}\,(p \cdot q - F(p))$   $(q \in \mathbb{R}^m)$,

which turns out to be the rate function:

Theorem. The probability measures $\{P_n\}_{n=1}^\infty$ satisfy the large deviation principle with rate
function $I(\cdot) = L(\cdot)$.

Remark. By the Law of Large Numbers

    $S_n \to E(Y) =: \bar{y}$  a.s. as $n \to \infty$.

As we will compute in the following proof,

    $DF(0) = \bar{y}$

and so

    $DL(\bar{y}) = 0$,

provided L is smooth at $\bar{y}$. Hence $x \mapsto L(x)$ has its minimum at $\bar{y}$, and in fact

    $L(\bar{y}) = 0$,
    $L(x) > 0$   $(x \neq \bar{y})$.

Take a Borel set E. Assuming

    $\inf_{E^\circ} L = \inf_E L = \inf_{\bar{E}} L$,

we deduce

    $P_n(E) = e^{n(-\inf_E L + o(1))}$.

So if $\bar{y} \notin \bar{E}$, $P_n(E) \to 0$ exponentially fast as $n \to \infty$.
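
To see the exponential decay in a concrete case, consider the following sketch. It assumes a hypothetical example not discussed in the text: fair coin flips $Y_k \in \{0,1\}$ with $m = 1$, so that $F(p) = \log((1 + e^p)/2)$ and the Legendre transform (6) has the closed form $L(q) = q\log q + (1-q)\log(1-q) + \log 2$ for $0 < q < 1$. The code compares a brute-force evaluation of the supremum in (6) with this closed form, and compares the exact binomial tail $P_n([a,\infty))$ with $e^{-nL(a)}$ on the logarithmic scale.

```python
import math


def F(p):
    """Log of the exponential generating function for a fair coin on {0, 1}."""
    return math.log((1.0 + math.exp(p)) / 2.0)


def L_numeric(q, p_max=20.0, n=40000):
    """Legendre transform (6), approximated by a maximum over a grid of p."""
    best = -math.inf
    for i in range(n + 1):
        p = -p_max + 2.0 * p_max * i / n
        best = max(best, p * q - F(p))
    return best


def L_exact(q):
    """Closed form of the rate function for the fair coin, valid for 0 < q < 1."""
    return q * math.log(q) + (1.0 - q) * math.log(1.0 - q) + math.log(2.0)


def log_tail_per_n(n, k_min):
    """(1/n) log P(Y_1 + ... + Y_n >= k_min), computed exactly with integers."""
    total = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return (math.log(total) - n * math.log(2.0)) / n


rate = L_exact(0.7)                      # rate for the event {S_n >= 0.7}
empirical = log_tail_per_n(2000, 1400)   # (1/n) log P_n([0.7, infinity)), n = 2000
```

For $n = 2000$ the quantity `empirical` agrees with `-rate` up to a correction of order $(\log n)/n$, which is the $o(1)$ term in the Remark.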
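Proof. 1. Write $Y = (Y^1, \ldots, Y^m)$. Then for $1 \leq k, l \leq m$:

    $\frac{\partial F}{\partial p_k} = \frac{E(Y^k e^{p \cdot Y})}{E(e^{p \cdot Y})}$,

    $\frac{\partial^2 F}{\partial p_k \partial p_l} = \frac{E(Y^k Y^l e^{p \cdot Y})}{E(e^{p \cdot Y})} - \frac{E(Y^k e^{p \cdot Y})\, E(Y^l e^{p \cdot Y})}{E(e^{p \cdot Y})^2}$.

Thus if $\xi \in \mathbb{R}^m$,

    $\sum_{k,l=1}^m F_{p_k p_l} \xi_k \xi_l = \frac{E((Y \cdot \xi)^2 e^{p \cdot Y})\, E(e^{p \cdot Y}) - E((Y \cdot \xi) e^{p \cdot Y})^2}{E(e^{p \cdot Y})^2} \geq 0$,

since

    $E((Y \cdot \xi) e^{p \cdot Y}) \leq E((Y \cdot \xi)^2 e^{p \cdot Y})^{1/2}\, E(e^{p \cdot Y})^{1/2}$.

Hence

(7)    $p \mapsto F(p)$ is smooth, convex.

Clearly also

(8)    $F(0) = \log E(e^0) = 0$.

Define L by the Legendre transformation (6). Then

    $L(q) = \sup_p\,(p \cdot q - F(p)) \geq -F(0) = 0$

for all q, and so

(9)    $L : \mathbb{R}^m \to [0,\infty]$ is convex, lower semicontinuous.

In addition

    $\lim_{|q| \to \infty} \frac{L(q)}{|q|} = \infty$

(cf. [E1, §III.3]), and thus for each $l \in \mathbb{R}$,

(10)    the set $\{q \in \mathbb{R}^m \mid 0 \leq L(q) \leq l\}$ is compact.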

2. Next take $g : \mathbb{R}^m \to \mathbb{R}$ to be bounded, Lipschitz. We intend to prove

(11)    $\lim_{n \to \infty} \frac{1}{n} \log\left( \int_{\mathbb{R}^m} e^{ng}\, dP_n \right) = \sup_{\mathbb{R}^m}\,(g - L)$,

where $P_n$ is the distribution of $S_n = \frac{Y_1 + \cdots + Y_n}{n}$ on $\mathbb{R}^m$. The idea is to associate somehow the
left hand side of (11) with the unique viscosity solution of the Hamilton-Jacobi PDE

(12)    $u_t - F(Du) = 0$  in $\mathbb{R}^m \times (0,\infty)$,
        $u = g$  on $\mathbb{R}^m \times \{t = 0\}$.

The right hand side of (11) will appear when we invoke the Hopf-Lax formula for the solution
of (12).

To carry out this program, we fix any point $x \in \mathbb{R}^m$ and then write

    $t_k = k/n$   $(k = 0, \ldots)$.

We define

(13)    $w_n(x,t_k) := E\left( h_n\left( \frac{Y_1 + \cdots + Y_k}{n} + x \right) \right)$,

where

(14)    $h_n := e^{ng}$.

3. We first claim:

(15)    $w_n(x,t_{k+1}) = E\left( w_n\left( x + \frac{Y_{k+1}}{n},\ t_k \right) \right)$   $(k = 0, \ldots)$.

This identity is valid since the random variables $\{Y_k\}_{k=1}^\infty$ are independent. Indeed,

    $w_n(x,t_{k+1}) = E\left( h_n\left( \frac{Y_1 + \cdots + Y_k}{n} + \frac{Y_{k+1}}{n} + x \right) \right)$
        $= E\left( E\left( h_n\left( \frac{Y_1 + \cdots + Y_k}{n} + \frac{Y_{k+1}}{n} + x \right) \Big|\ Y_{k+1} \right) \right)$
        $= E\left( w_n\left( x + \frac{Y_{k+1}}{n},\ t_k \right) \right)$,

the last equality holding by independence. More precisely, we used here the formula

    $E(\phi(X,Y) \mid Y) = \psi(Y)$  a.s.,

where X, Y are independent random variables, $\phi$ is continuous, bounded, and

    $\psi(y) := E(\phi(X,y))$.

See, e.g., Breiman [BN, §4.2].
4. Next define

(16)    $u_n(x,t_k) := \frac{1}{n} \log w_n(x,t_k)$

for $n = 1, \ldots$, $k = 0, \ldots$. We assert next that

(17)    $\|u_n\|_{L^\infty} \leq \|g\|_{L^\infty}$,   $\|Du_n\|_{L^\infty} \leq \|Dg\|_{L^\infty}$,

D as usual denoting the gradient in the spatial variable x. Let us check (17) by first noting
from (13), (14) that

    $\|w_n\|_{L^\infty} \leq \|h_n\|_{L^\infty} = e^{n\|g\|_{L^\infty}}$.

The first inequality in (17) follows. Now fix a time $t_k = k/n$. Then for a.e. $x \in \mathbb{R}^m$ we may
compute from (13), (14) that

    $Dw_n(x,t_k) = E\left( Dh_n\left( \frac{Y_1 + \cdots + Y_k}{n} + x \right) \right)$
        $= n E\left( Dg\, h_n\left( \frac{Y_1 + \cdots + Y_k}{n} + x \right) \right)$.

Consequently

(18)    $|Dw_n| \leq n \|Dg\|_{L^\infty}\, E\left( h_n\left( \frac{Y_1 + \cdots + Y_k}{n} + x \right) \right) = n \|Dg\|_{L^\infty}\, w_n$.

Recalling (16) we deduce the second inequality in (17).

Next take a point $x \in \mathbb{R}^m$ and compute:

    $w_n(x,t_{k+1}) = E\left( e^{n g\left( \frac{Y_1 + \cdots + Y_{k+1}}{n} + x \right)} \right)$
        $\leq E\left( e^{n g\left( \frac{Y_1 + \cdots + Y_k}{n} + x \right) + \|Dg\|_{L^\infty} |Y_{k+1}|} \right)$
        $= E\left( e^{n g\left( \frac{Y_1 + \cdots + Y_k}{n} + x \right)} \right) E\left( e^{\|Dg\|_{L^\infty} |Y_{k+1}|} \right)$

by independence. Thus

(19)    $w_n(x,t_{k+1}) \leq w_n(x,t_k)\, E\left( e^{\|Dg\|_{L^\infty} |Y|} \right)$

for $Y = Y_1$, as the $\{Y_k\}_{k=1}^\infty$ are identically distributed. Assumption (2) implies

    $E\left( e^{\|Dg\|_{L^\infty} |Y|} \right) =: e^C < \infty$.

Therefore (16), (19) imply:

    $u_n(x,t_{k+1}) - u_n(x,t_k) \leq \frac{C}{n}$,

and a similar calculation verifies that

    $u_n(x,t_{k+1}) - u_n(x,t_k) \geq -\frac{C}{n}$.

Consequently

(20)    $|u_n(x,t_k) - u_n(x,t_l)| \leq C |t_k - t_l|$   $(k, l \geq 1)$.

5. Extend $u_n(x,t)$ to be linear in t for $t \in [t_k, t_{k+1}]$ $(k = 0, \ldots)$. Then estimates (17),
(20) imply there exists a sequence $n_r \to \infty$ such that

(21)    $u_{n_r} \to u$  locally uniformly in $\mathbb{R}^m \times [0,\infty)$.

Obviously $u = g$ on $\mathbb{R}^m \times \{t = 0\}$. We assert as well that u is a viscosity solution of the
PDE

(22)    $u_t - F(Du) = 0$  in $\mathbb{R}^m \times (0,\infty)$.

To verify this, we recall the relevant definitions from Chapter VI, take any $v \in C^2(\mathbb{R}^m \times
(0,\infty))$ and suppose

    $u - v$ has a strict local maximum at a point $(x_0,t_0)$.

We must prove:

(23)    $v_t(x_0,t_0) - F(Dv(x_0,t_0)) \leq 0$.

We may also assume, upon redefining v outside some neighborhood of $(x_0,t_0)$ if necessary,
that $u(x_0,t_0) = v(x_0,t_0)$,

(24)    $\sup |v, Dv, D^2v| < \infty$,

and $v > \sup_n(u_n)$ except in some region near $(x_0,t_0)$. In view of (21) we can find for $n = n_r$
points $(x_n, t_{k_n})$, $t_{k_n} = k_n/n$, such that

(25)    $u_n(x_n,t_{k_n}) - v(x_n,t_{k_n}) = \max_{x \in \mathbb{R}^m,\ k = 0, \ldots} \left[ u_n(x,t_k) - v(x,t_k) \right]$

and

(26)    $(x_n,t_{k_n}) \to (x_0,t_0)$  as $n = n_r \to \infty$.

Write

(27)    $\alpha_n := u_n(x_n,t_{k_n}) - v(x_n,t_{k_n})$.

Then for $n = n_r$:

    $e^{n(\alpha_n + v(x_n,t_{k_n}))} = e^{n u_n(x_n,t_{k_n})}$
        $= w_n(x_n,t_{k_n})$   by (16)
        $= E\left( w_n\left( x_n + \frac{Y}{n},\ t_{k_n - 1} \right) \right)$   by (15)
        $= E\left( e^{n u_n\left( x_n + \frac{Y}{n},\ t_{k_n - 1} \right)} \right)$   by (16)
        $\leq E\left( e^{n\left( \alpha_n + v\left( x_n + \frac{Y}{n},\ t_{k_n - 1} \right) \right)} \right)$,

the last inequality holding according to (25), (27). Thus

    $e^{n v(x_n,t_{k_n})} \leq E\left( e^{n v\left( x_n + \frac{Y}{n},\ t_{k_n - 1} \right)} \right)$

for $n = n_r$. Now

    $v\left( x_n + \frac{Y}{n},\ t_{k_n - 1} \right) = v(x_n,t_{k_n - 1}) + Dv(x_n,t_{k_n - 1}) \cdot \frac{Y}{n} + \beta_n$,

where

(28)    $\beta_n := v\left( x_n + \frac{Y}{n},\ t_{k_n - 1} \right) - v(x_n,t_{k_n - 1}) - Dv(x_n,t_{k_n - 1}) \cdot \frac{Y}{n}$.

Thus

    $e^{n v(x_n,t_{k_n})} \leq e^{n v(x_n,t_{k_n - 1})}\, E\left( e^{Dv(x_n,t_{k_n - 1}) \cdot Y + n\beta_n} \right)$,

and hence

(29)    $\frac{v(x_n,t_{k_n}) - v(x_n,t_{k_n - 1})}{1/n} \leq \log E\left( e^{Dv(x_n,t_{k_n - 1}) \cdot Y + n\beta_n} \right)$.

Now (24), (28) imply

    $\lim_{n \to \infty} n\beta_n = 0$  a.s.,

and furthermore

    $\left| e^{Dv \cdot Y + n\beta_n} \right| \leq e^{C|Y|}$.

Our assumption (2) implies also that $E(e^{C|Y|}) < \infty$. We consequently may invoke the
Dominated Convergence Theorem and pass to limits as $n = n_r \to \infty$:

    $v_t(x_0,t_0) \leq \log E\left( e^{Dv(x_0,t_0) \cdot Y} \right) = F(Dv(x_0,t_0))$.

This is (23), and the reverse inequality likewise holds should $u - v$ have a strict local minimum
at a point $(x_0,t_0)$. We have therefore proved u is a viscosity solution of (22). Since $u = g$ on
$\mathbb{R}^m \times \{t = 0\}$, we conclude that u is the unique viscosity solution of the initial value problem
(12). In particular $u_n \to u$.

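6. We next transform (12) into a different form, by noting that $\tilde{u} = -u$ is the unique
viscosity solution of

(30)    $\tilde{u}_t + \tilde{F}(D\tilde{u}) = 0$  in $\mathbb{R}^m \times (0,\infty)$,
        $\tilde{u} = \tilde{g}$  on $\mathbb{R}^m \times \{t = 0\}$,

for

(31)    $\tilde{g} = -g$,   $\tilde{F}(p) = F(-p)$.

Indeed if $\tilde{u} - \tilde{v}$ has a local maximum at $(x_0,t_0)$, then $u - v$ has a local minimum, where
$v = -\tilde{v}$. Thus

(32)    $v_t(x_0,t_0) - F(Dv(x_0,t_0)) \geq 0$,

since u is a viscosity solution of (22). But $D\tilde{v} = -Dv$, $\tilde{v}_t = -v_t$. Consequently (32) says

    $\tilde{v}_t(x_0,t_0) + \tilde{F}(D\tilde{v}(x_0,t_0)) \leq 0$.

The reverse inequality obtains if $\tilde{u} - \tilde{v}$ has a local minimum at $(x_0,t_0)$. This proves (30).
According to the Hopf-Lax formula from §V.B:

(33)    $\tilde{u}(x,t) = \inf_y \left\{ t \tilde{L}\left( \frac{x - y}{t} \right) + \tilde{g}(y) \right\}$,

where $\tilde{L}$ is the Legendre transform of the convex function $\tilde{F}$. But then

    $\tilde{L}(q) = \sup_p\,(p \cdot q - \tilde{F}(p))$
        $= \sup_p\,(p \cdot q - F(-p))$
        $= \sup_p\,(p \cdot (-q) - F(p))$
        $= L(-q)$.

Therefore

    $u(x,t) = -\tilde{u}(x,t)$
        $= \sup_y \left\{ -t \tilde{L}\left( \frac{x - y}{t} \right) - \tilde{g}(y) \right\}$
        $= \sup_y \left\{ g(y) - t L\left( \frac{y - x}{t} \right) \right\}$.

In particular

(34)    $u(0,1) = \sup_y \{ g(y) - L(y) \}$.

But

(35)    $u_n(0,1) = \frac{1}{n} \log w_n(0,t_n)$
            $= \frac{1}{n} \log E\left( e^{n g\left( \frac{Y_1 + \cdots + Y_n}{n} \right)} \right)$
            $= \frac{1}{n} \log\left( \int_{\mathbb{R}^m} e^{ng}\, dP_n \right)$.

As $u_n(0,1) \to u(0,1)$, (34) and (35) confirm the limit (11).

The second theorem in §B thus implies that $I = L$ is the rate function for $\{P_n\}_{n=1}^\infty$.

□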

Remark. This proof illustrates the vague belief that rate functions, interpreted as functions 
of appropriate parameters, are viscosity solutions of Hamilton-Jacobi type nonlinear PDE. 
The general validity of this principle is unclear, but there are certainly many instances in
the research literature. See for instance the next section of these notes, and look also in the 
book by Freidlin-Wentzell [F-W] . 

If we accept from §B.l the identification of rate functions and entropy (up to a minus 
sign and Boltzmann's constant), then the foregoing provides us with a quite new interplay 
between entropy ideas and nonlinear PDE. □ 

D. Small noise in dynamical systems

In this last section we discuss another PDE approach to a large deviations problem, this
involving the small noise asymptotics of stochastic ODE.

1. Stochastic differential equations

We rapidly recount in this subsection the rudiments of stochastic ODE theory: see, e.g.,
Arnold [A], Freidlin [FR] or Oksendal [OK] for more.

Notation. (i) $(\Omega, \mathcal{F}, \pi)$ is a probability space.

(ii) $\{W(t)\}_{t \geq 0}$ is an m-dimensional Wiener process (a.k.a. Brownian motion) defined on
$(\Omega, \mathcal{F}, \pi)$. We write

    $W(t) = (W_1(t), \ldots, W_m(t))$.

(iii) $b : \mathbb{R}^n \to \mathbb{R}^n$, $b = (b_1, \ldots, b_n)$ and $B : \mathbb{R}^n \to \mathbb{M}^{n \times m}$, $B = ((b_{ij}))$ are given Lipschitz
functions.

(iv) $X_0$ is an $\mathbb{R}^n$-valued random variable defined on $(\Omega, \mathcal{F}, \pi)$.

(v) $\mathcal{F}(t) = \sigma(X_0, W(s)\ (0 \leq s \leq t))$, the smallest $\sigma$-algebra with respect to which $X_0$
and $W(s)$ for $0 \leq s \leq t$ are measurable. □

We intend to study the stochastic differential equation

(1)    $dX(t) = b(X(t))\, dt + B(X(t))\, dW(t)$   $(t > 0)$,
       $X(0) = X_0$,

for the unknown $\mathbb{R}^n$-valued stochastic process $\{X(t)\}_{t \geq 0}$ defined on $(\Omega, \mathcal{F}, \pi)$.

Remarks. (i) We say $\{X(t)\}_{t \geq 0}$ solves (1) provided this process is progressively measurable
with respect to $\{\mathcal{F}(t)\}_{t \geq 0}$ and

(2)    $X(t) = X_0 + \int_0^t b(X(s))\, ds + \int_0^t B(X(s)) \cdot dW(s)$

a.s., for each time $t \geq 0$. The last term on the right is an Itô stochastic integral, defined for
instance in [A], [FR], etc.
(ii) We may heuristically rewrite (1) to read

(3)    $\dot{X}(t) = b(X(t)) + B(X(t)) \cdot \xi(t)$   $(t > 0)$,
       $X(0) = X_0$,

where $\dot{\ } = \frac{d}{dt}$ and

(4)    "$\xi(t) = \frac{dW(t)}{dt}$ = m-dimensional white noise."

(iii) If we additionally assume that $X_0$ is independent of $\{W(t)\}_{t \geq 0}$ and $E(|X_0|^2) < \infty$,
then there exists a unique solution $\{X(t)\}_{t \geq 0}$ of (1), such that

(5)    $E\left( \int_0^T |X(t)|^2\, dt \right) < \infty$

for each time $T > 0$. "Unique" here means that if $\{\tilde{X}(t)\}_{t \geq 0}$ is another process solving (1)
and satisfying an estimate like (5), then

(6)    $\pi(X(t) = \tilde{X}(t)$ for all $0 \leq t \leq T) = 1$

for each $T > 0$. Furthermore, the sample paths $t \mapsto X(t)$ are continuous, with probability
one.

2. Itô's formula, elliptic PDE

Solutions of (1) are connected to solutions of certain linear elliptic PDE of second order.
The key is Itô's chain rule, which states that if $u : \mathbb{R}^n \to \mathbb{R}$ is a $C^2$-function, then

(7)    $du(X(t)) = Du(X(t)) \cdot dX(t) + \frac{1}{2} A(X(t)) : D^2u(X(t))\, dt$,

where $A : \mathbb{R}^n \to \mathbb{M}^{n \times n}$ is defined by

    $A = BB^T$.

Remarks. (i) If we write $A = ((a^{ij}))$, then

(8)    $\sum_{i,j=1}^n a^{ij} \xi_i \xi_j \geq 0$   $(\xi \in \mathbb{R}^n)$.

(ii) Formula (7) means

(9)    $u(X(t)) = u(X(0))$
           $+ \int_0^t \left( \sum_{i=1}^n b_i(X(s))\, u_{x_i}(X(s)) + \frac{1}{2} \sum_{i,j=1}^n a^{ij}(X(s))\, u_{x_i x_j}(X(s)) \right) ds$
           $+ \int_0^t \sum_{i=1}^n \sum_{k=1}^m b_{ik}(X(s))\, u_{x_i}(X(s))\, dW_k(s)$

for each time $t \geq 0$.

□

Next assume u solves the PDE⁴

(10)    $\frac{1}{2} \sum_{i,j=1}^n a^{ij} u_{x_i x_j} + \sum_{i=1}^n b_i u_{x_i} = 0$  in U,
        $u = g$  on $\partial U$,

where $U \subset \mathbb{R}^n$ is a bounded, connected open set with smooth boundary, and $g : \partial U \to \mathbb{R}$ is
given. In view of (8) this is a (possibly degenerate) elliptic PDE.

Fix a point $x \in U$ and let $\{X(t)\}_{t \geq 0}$ solve the stochastic DE

(11)    $dX(t) = b(X(t))\, dt + B(X(t))\, dW(t)$   $(t > 0)$,
        $X(0) = x$.

Define also the hitting time

(12)    $\tau_x := \min\{t \geq 0 \mid X(t) \in \partial U\}$.

Assume, as will be the case in §3, 4 following, that $\tau_x < \infty$ a.s. We apply Itô's formula, with
u a solution of (10) and the random variable $\tau_x$ replacing t. The ds-integral in (9) vanishes because of the PDE (10), leaving

    $u(X(\tau_x)) = u(X(0)) + \int_0^{\tau_x} Du \cdot B\, dW$.

But $X(0) = x$ and $u(X(\tau_x)) = g(X(\tau_x))$. Thus

    $u(x) = g(X(\tau_x)) - \int_0^{\tau_x} Du \cdot B\, dW$.

We take expected values, and recall from [A], [FR], etc. that

    $E\left( \int_0^{\tau_x} Du \cdot B\, dW \right) = 0$,

to deduce this stochastic representation formula for u:

(13)    $u(x) = E(g(X(\tau_x)))$   $(x \in U)$.

Note that X and $\tau_x$ here depend on x.

⁴ Note that there is no minus sign in front of the term involving the second derivatives: this differs from
the convention in [E1, Chapter VI].

3. An exit problem

We hereafter assume the uniform ellipticity condition

(14)    $\sum_{i,j=1}^n a^{ij}(x)\, \xi_i \xi_j \geq \theta |\xi|^2$

and suppose also that $b_i$, $a^{ij}$ $(1 \leq i, j \leq n)$ are smooth.

a. Small noise asymptotics

Take $\varepsilon > 0$. We rescale the noise term in (11):

(15)    $dX_\varepsilon(t) = b(X_\varepsilon(t))\, dt + \varepsilon B(X_\varepsilon(t)) \cdot dW(t)$   $(t > 0)$,
        $X_\varepsilon(0) = x$.

Now as $\varepsilon \to 0$, we can expect the random trajectories $t \mapsto X_\varepsilon(t)$ to converge somehow to
the deterministic trajectories $t \mapsto x(t)$, where

(16)    $\dot{x}(t) = b(x(t))$   $(t > 0)$,
        $x(0) = x$.

We are therefore interpreting (15) as modeling the dynamics of a particle moving with
velocity $v = b$ plus a small noise term. What happens when $\varepsilon \to 0$?
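
The convergence of the random trajectories to the deterministic one can be observed in a simple simulation. The sketch below rests on assumptions chosen only for illustration: a one-dimensional problem with the hypothetical drift $b(x) = -x$ and $B = 1$, discretized by the Euler-Maruyama scheme, with a fixed random seed so that the same Brownian increments are reused for every $\varepsilon$. The function name `max_deviation` and its parameters are not from the text.

```python
import math
import random


def max_deviation(eps, x0=1.0, T=1.0, steps=1000, seed=0):
    """Largest gap on [0, T] between an Euler-Maruyama path of
    dX = b(X) dt + eps dW and the Euler path of dx/dt = b(x),
    for the hypothetical drift b(x) = -x."""
    rng = random.Random(seed)      # fixed seed: same noise for every eps
    dt = T / steps
    x_noisy, x_det = x0, x0
    worst = 0.0
    for _ in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        x_noisy += -x_noisy * dt + eps * dW  # stochastic step, as in (15)
        x_det += -x_det * dt                 # deterministic step, as in (16)
        worst = max(worst, abs(x_noisy - x_det))
    return worst


deviations = {eps: max_deviation(eps) for eps in (1.0, 0.1, 0.01)}
```

Because the noise is additive in this toy case, the deviation scales linearly in $\varepsilon$; the large deviation question is how unlikely it is for $X_\varepsilon(\cdot)$ to stray far from $x(\cdot)$ anyway.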

This problem fits into the large deviations framework. We take $\Sigma = C([0,T]; \mathbb{R}^n)$ for
some $T > 0$ and write $P_\varepsilon$ to denote the distribution of the process $X_\varepsilon(\cdot)$ on $\Sigma$. Freidlin
and Wentzell have shown that $\{P_\varepsilon\}_{0 < \varepsilon \leq 1}$ satisfies the large deviation principle with a rate
function $I[\cdot]$, defined this way:

(17)    $I[y(\cdot)] = \frac{1}{2} \int_0^T \sum_{i,j=1}^n a_{ij}(y(s))\,(\dot{y}_i(s) - b_i(y(s)))\,(\dot{y}_j(s) - b_j(y(s)))\, ds$
            if $y(\cdot) \in H^1([0,T]; \mathbb{R}^n)$,
        $I[y(\cdot)] = +\infty$  otherwise.

Here $((a_{ij})) = A^{-1}$ is the inverse of the matrix $A = BB^T$ and $H^1([0,T]; \mathbb{R}^n)$ denotes the
Sobolev space of mappings from $[0,T] \to \mathbb{R}^n$ which are absolutely continuous, with square
integrable derivatives. We write $y(\cdot) = (y_1(\cdot), \ldots, y_n(\cdot))$.

b. Perturbations against the flow

We present now a PDE method for deriving an interesting special case of the aforementioned
large deviation result, and in particular demonstrate how $I[\cdot]$ above arises. We follow
Fleming [FL] and [E-I].

To set up this problem, take $U \subset \mathbb{R}^n$ as above, fix a point $x \in U$, and let $\{X_\varepsilon(t)\}_{t \geq 0}$ solve
the stochastic ODE (15). We now also select a smooth, relatively open subregion $\Gamma \subset \partial U$
and ask:

(18)    For small $\varepsilon > 0$, what is the probability that $X_\varepsilon(\cdot)$ first exits U through the region $\Gamma$?

This is in general a very difficult problem, and so we turn attention to a special case, by
hypothesizing concerning the vector field b that

(19)    if $y(\cdot) \in H^1_{loc}([0,\infty); \mathbb{R}^n)$ and
        $y(t) \in U$ for all $t \geq 0$, then
        $\int_0^\infty |\dot{y}(t) - b(y(t))|^2\, dt = +\infty$.

Condition (19) says that it requires an infinite amount of "energy" for a curve $y(\cdot)$ to resist
being swept along with the flow $x(\cdot)$ determined by b, staying within U for all times $t \geq 0$.

[Figure: flow lines of the ODE $\dot{x} = b(x)$, and a random trajectory which exits U against the deterministic flow.]

Intuitively we expect that for small $\varepsilon > 0$, the overwhelming majority of the sample paths
of $X_\varepsilon(\cdot)$ will stay close to $x(\cdot)$ and so be swept out of U in finite time. If, on the other
hand, we take for $\Gamma$ a smooth "window" within $\partial U$ lying upstream from x, the probability
that a sample path of $X_\varepsilon(\cdot)$ will move against the flow and so exit U through $\Gamma$ should be
very small.

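Notation. (i)

(20)    $u_\varepsilon(x)$ = probability that $X_\varepsilon(\cdot)$ first exits $\partial U$ through $\Gamma$
                $= \pi(X_\varepsilon(\tau_x) \in \Gamma)$.

(ii)

(21)    $g = \chi_\Gamma$, that is, $g = 1$ on $\Gamma$ and $g = 0$ on $\partial U - \Gamma$.

□

Then

(22)    $u_\varepsilon(x) = E(g(X_\varepsilon(\tau_x)))$   $(x \in U)$.

But according to §2, $u_\varepsilon$ solves the boundary value problem

(23)    $\frac{\varepsilon^2}{2} \sum_{i,j=1}^n a^{ij} u^\varepsilon_{x_i x_j} + \sum_{i=1}^n b_i u^\varepsilon_{x_i} = 0$  in U,
        $u_\varepsilon = 1$  on $\Gamma$,
        $u_\varepsilon = 0$  on $\partial U - \bar{\Gamma}$.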
We are interested in the asymptotic behavior of the function $u_\varepsilon$ as $\varepsilon \to 0$.

Theorem. Assume U is connected. We then have

(24)    $u_\varepsilon(x) = e^{-\frac{w(x) + o(1)}{\varepsilon^2}}$  as $\varepsilon \to 0$,

uniformly on compact subsets of $U \cup \Gamma$, where

(25)    $w(x) := \inf_{y(\cdot) \in \mathcal{A}} \left\{ \frac{1}{2} \int_0^\tau \sum_{i,j=1}^n a_{ij}(y(s))\,(\dot{y}_i(s) - b_i(y(s)))\,(\dot{y}_j(s) - b_j(y(s)))\, ds \right\}$,

the infimum taken among curves in the admissible class

(26)    $\mathcal{A} = \{ y(\cdot) \in H^1_{loc}([0,\infty); \mathbb{R}^n) \mid y(t) \in U$ for $0 \leq t < \tau$, $y(\tau) \in \Gamma$ if $\tau < \infty \}$.
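
A one-dimensional toy case makes the exponential asymptotics explicit. The following sketch is an illustration under assumptions not made in the text: $U = (0,1)$, constant drift $b > 0$ (so the flow moves to the right), $a = 1$, and the upstream window $\Gamma = \{0\}$. Then the boundary value problem reads $\frac{\varepsilon^2}{2}u'' + bu' = 0$, $u(0) = 1$, $u(1) = 0$, with explicit solution $u_\varepsilon(x) = (e^{-rx} - e^{-r})/(1 - e^{-r})$, $r = 2b/\varepsilon^2$. Minimizing $\frac{1}{2}\int (\dot{y} - b)^2\, ds$ over paths running from x to 0 at constant speed v gives the cost $\frac{(v+b)^2 x}{2v}$, which is smallest at $v = b$ with value $2bx$. The code evaluates $-\varepsilon^2 \log u_\varepsilon(x)$ and compares it with $2bx$.

```python
import math


def w_eps(x, eps, b=1.0):
    """-eps^2 * log u_eps(x) for the 1-D toy problem on (0, 1), 0 <= x < 1.

    log u_eps is expanded as
        -r x + log(1 - exp(-r (1 - x))) - log(1 - exp(-r)),  r = 2 b / eps^2,
    which avoids underflow when eps is small."""
    r = 2.0 * b / eps ** 2
    log_u = (-r * x
             + math.log1p(-math.exp(-r * (1.0 - x)))
             - math.log1p(-math.exp(-r)))
    return -eps ** 2 * log_u


def w_limit(x, b=1.0):
    """Minimal cost 2 b x of moving from x to 0 against the drift b."""
    return 2.0 * b * x


gap_coarse = abs(w_eps(0.5, 0.5) - w_limit(0.5))  # eps = 0.5
gap_fine = abs(w_eps(0.5, 0.1) - w_limit(0.5))    # eps = 0.1
```

In this example the gap between `w_eps` and `w_limit` is already negligible at $\varepsilon = 0.1$, matching the form $u_\varepsilon = e^{-(w + o(1))/\varepsilon^2}$.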

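Proof (Outline). 1. We introduce a rescaled version of the log transform from Chapter IV,
by setting

(27)    $w_\varepsilon(x) := -\varepsilon^2 \log u_\varepsilon(x)$   $(x \in U)$.

According to the Strong Maximum Principle,

    $0 < u_\varepsilon(x) < 1$  in U,

and so the definition (27) makes sense, with

    $w_\varepsilon > 0$  in U.

We compute:

    $w^\varepsilon_{x_i} = -\varepsilon^2 \frac{u^\varepsilon_{x_i}}{u^\varepsilon}$,
    $w^\varepsilon_{x_i x_j} = -\varepsilon^2 \frac{u^\varepsilon_{x_i x_j}}{u^\varepsilon} + \varepsilon^2 \frac{u^\varepsilon_{x_i} u^\varepsilon_{x_j}}{(u^\varepsilon)^2}$.

Thus our PDE (23) becomes

(28)    $-\frac{\varepsilon^2}{2} \sum_{i,j=1}^n a^{ij} w^\varepsilon_{x_i x_j} + \frac{1}{2} \sum_{i,j=1}^n a^{ij} w^\varepsilon_{x_i} w^\varepsilon_{x_j} - \sum_{i=1}^n b_i w^\varepsilon_{x_i} = 0$  in U,
        $w_\varepsilon = 0$  on $\Gamma$,
        $w_\varepsilon \to \infty$  at $\partial U - \bar{\Gamma}$.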

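2. We intend to estimate $|Dw_\varepsilon|$ on compact subsets of $U \cup \Gamma$, as in §IV.A.2. For this let
us first differentiate the PDE with respect to $x_k$ (writing $w = w_\varepsilon$):

(29)    $-\frac{\varepsilon^2}{2} \sum_{i,j=1}^n a^{ij} w_{x_k x_i x_j} + \sum_{i,j=1}^n a^{ij} w_{x_k x_i} w_{x_j} - \sum_{i=1}^n b_i w_{x_k x_i} = R_1$,

where the remainder term $R_1$ satisfies the estimate

    $|R_1| \leq C(\varepsilon^2 |D^2 w_\varepsilon| + |Dw_\varepsilon|^2 + 1)$.

Now set

(30)    $\gamma := |Dw_\varepsilon|^2$,

so that

    $\gamma_{x_i} = 2 \sum_{k=1}^n w_{x_k} w_{x_k x_i}$,
    $\gamma_{x_i x_j} = 2 \sum_{k=1}^n \left( w_{x_k} w_{x_k x_i x_j} + w_{x_k x_i} w_{x_k x_j} \right)$.

Thus

(31)    $-\frac{\varepsilon^2}{2} \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} - \sum_{i=1}^n b_i \gamma_{x_i}$
            $= 2 \sum_{k=1}^n w_{x_k} \left( -\frac{\varepsilon^2}{2} \sum_{i,j=1}^n a^{ij} w_{x_k x_i x_j} - \sum_{i=1}^n b_i w_{x_k x_i} \right)$
              $- \varepsilon^2 \sum_{k=1}^n \sum_{i,j=1}^n a^{ij} w_{x_k x_i} w_{x_k x_j}$.

Now

    $\sum_{k=1}^n \sum_{i,j=1}^n a^{ij} w_{x_k x_i} w_{x_k x_j} \geq \theta |D^2 w_\varepsilon|^2$.

This inequality and (29) imply:

(32)    $-\frac{\varepsilon^2}{2} \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} - \sum_{i=1}^n b_i \gamma_{x_i} \leq -\varepsilon^2 \theta |D^2 w_\varepsilon|^2 + R_2$,

where

    $|R_2| \leq C(\varepsilon^2 |D^2 w_\varepsilon| |Dw_\varepsilon| + |Dw_\varepsilon|^3 + 1)$
        $\leq \frac{\varepsilon^2 \theta}{2} |D^2 w_\varepsilon|^2 + C(\gamma^{3/2} + 1)$.

Consequently (32) yields the inequality:

(33)    $\frac{\theta \varepsilon^2}{2} |D^2 w_\varepsilon|^2 - \frac{\varepsilon^2}{2} \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} - \sum_{i=1}^n b_i \gamma_{x_i} \leq C(\gamma^{3/2} + 1)$.

Now the PDE (28) implies

    $\gamma \leq C(\varepsilon^2 |D^2 w_\varepsilon| + |Dw_\varepsilon|)$
        $= C(\varepsilon^2 |D^2 w_\varepsilon| + \gamma^{1/2})$
        $\leq C(\varepsilon^2 |D^2 w_\varepsilon| + 1) + \frac{\gamma}{2}$,

and so

    $\gamma \leq C(\varepsilon^2 |D^2 w_\varepsilon| + 1)$.

This inequality and (33) give us the estimate:

(34)    $\sigma \gamma^2 - \frac{\varepsilon^4}{2} \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} \leq \varepsilon^2 C(|D\gamma| + \gamma^{3/2}) + C$,

for some $\sigma > 0$.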

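3. We employ this differential inequality to estimate $\gamma$. Take any subregion $V \subset\subset U \cup \Gamma$
and then select a smooth cutoff function $\zeta$ such that

$$0 \le \zeta \le 1, \quad \zeta \equiv 1 \text{ on } V, \quad \zeta \equiv 0 \text{ near } \partial U - \Gamma.$$

Write

$$\tag{35} \eta := \zeta^4 \gamma$$

and compute

$$\tag{36}
\begin{aligned}
\eta_{x_i} & = \zeta^4 \gamma_{x_i} + 4 \zeta^3 \zeta_{x_i} \gamma \\
\eta_{x_i x_j} & = \zeta^4 \gamma_{x_i x_j} + 4 \zeta^3 (\zeta_{x_j} \gamma_{x_i} + \zeta_{x_i} \gamma_{x_j}) + 4 (\zeta^3 \zeta_{x_i})_{x_j} \gamma.
\end{aligned}$$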

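Select a point $x_0 \in \bar{U}$ where $\eta$ attains its maximum. Consider first the case that $x_0 \in U$,
$\zeta(x_0) > 0$. Then

$$D\eta(x_0) = 0, \quad D^2 \eta(x_0) \le 0.$$

Owing to (36),

$$\tag{37} \zeta D\gamma = -4 \gamma D\zeta \quad \text{at } x_0,$$

and also

$$-\sum_{i,j=1}^n a^{ij} \eta_{x_i x_j} \ge 0 \quad \text{at } x_0.$$

Thus at $x_0$:

$$-\frac{\varepsilon^4}{2} \zeta^4 \sum_{i,j=1}^n a^{ij} \gamma_{x_i x_j} = -\frac{\varepsilon^4}{2} \sum_{i,j=1}^n a^{ij} \eta_{x_i x_j} + R_3 \ge R_3,$$

where

$$|R_3| \le \varepsilon^4 C(\zeta^3 |D\gamma| + \zeta^2 \gamma).$$

Therefore (34) implies

$$\begin{aligned}
\sigma \zeta^4 \gamma^2 & \le \varepsilon^2 \zeta^4 C(|D\gamma| + \gamma^{3/2}) + \varepsilon^4 C \zeta^2 \gamma + C \\
& \le \frac{\sigma}{2} \zeta^4 \gamma^2 + C.
\end{aligned}$$

Thus we can estimate $\eta = \zeta^4 \gamma$ at $x_0$ and so bound $|Dw^\varepsilon(x_0)|$.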

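4. If on the other hand $x_0 \in \partial U$, $\zeta(x_0) > 0$, then we note $u^\varepsilon = 1$ on $\partial U$ near $x_0$. In this
case we employ a standard barrier argument to obtain the estimate

$$|Du^\varepsilon(x_0)| \le \frac{C}{\varepsilon^2},$$

from which it follows that

$$\tag{38} |Dw^\varepsilon(x_0)| = \varepsilon^2 \frac{|Du^\varepsilon(x_0)|}{u^\varepsilon(x_0)} \le C.$$

Hence we can also estimate $\eta = \zeta^4 \gamma = \zeta^4 |Dw^\varepsilon|^2$ if $x_0 \in \partial U$. It follows that

$$\tag{39} \sup_V |Dw^\varepsilon| \le C$$

for each $V \subset\subset U \cup \Gamma$, the constant $C$ depending only on $V$ and not on $\varepsilon$.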

5. As $w^\varepsilon = 0$ on $\Gamma$, we deduce from (39) that

$$\tag{40} \sup_V |w^\varepsilon| \le C.$$

In view of (39), (40) there exists a sequence $\varepsilon_r \to 0$ such that

$$w^{\varepsilon_r} \to \tilde{w} \quad \text{uniformly on compact subsets of } U \cup \Gamma.$$

It follows from (28) that

$$\tag{41} \tilde{w} = 0 \quad \text{on } \Gamma$$

and

$$\tag{42} \frac{1}{2} \sum_{i,j=1}^n a^{ij} \tilde{w}_{x_i} \tilde{w}_{x_j} - \sum_{i=1}^n b_i \tilde{w}_{x_i} = 0 \quad \text{in } U,$$

in the viscosity sense: the proof is a straightforward adaptation of the vanishing viscosity
calculation in §VI.A. Since the PDE (42) holds a.e., we conclude that

$$|D\tilde{w}| \le C \quad \text{a.e. in } U,$$

and so

$$\tilde{w} \in C^{0,1}(U).$$

We must identify $\tilde{w}$.

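6. For this, we recall the definition

$$\tag{43} w(x) = \inf_{y(\cdot) \in \mathcal{A}} \left\{ \frac{1}{2} \int_0^\tau \sum_{i,j=1}^n a_{ij}(y(s)) (\dot y_i(s) - b_i(y(s)))(\dot y_j(s) - b_j(y(s))) \, ds \right\},$$

the admissible class $\mathcal{A}$ defined by (26). Clearly then

$$\tag{44} w = 0 \quad \text{on } \Gamma.$$

We claim that in fact

$$\tag{45} \frac{1}{2} \sum_{i,j=1}^n a^{ij} w_{x_i} w_{x_j} - \sum_{i=1}^n b_i w_{x_i} = 0 \quad \text{in } U,$$

in the viscosity sense. To prove this, take a smooth function $v$ and suppose

$$\tag{46} w - v \text{ has a local maximum at a point } x_0 \in U.$$

We must show

$$\tag{47} \frac{1}{2} \sum_{i,j=1}^n a^{ij} v_{x_i} v_{x_j} - \sum_{i=1}^n b_i v_{x_i} \le 0 \quad \text{at } x_0.$$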
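To establish (47), note that (46) implies

$$\tag{48} w(x) - v(x) \le w(x_0) - v(x_0) \quad \text{if } x \in B(x_0, r)$$

for $r$ small enough.

Fix any $\alpha \in \mathbb{R}^n$ and consider the ODE

$$\tag{49}
\begin{cases}
\dot y(s) = b(y(s)) + A(y(s)) \alpha & (s > 0) \\
y(0) = x_0.
\end{cases}$$

Let $t > 0$ be so small that $y(t) \in B(x_0, r)$. Then (43) implies

$$w(x_0) \le \frac{1}{2} \int_0^t \sum_{i,j=1}^n a_{ij}(y(s)) (\dot y_i - b_i(y))(\dot y_j - b_j(y)) \, ds + w(y(t)).$$

Therefore (48), (49) give the inequality

$$\begin{aligned}
v(x_0) - v(y(t)) & \le w(x_0) - w(y(t)) \\
& \le \frac{1}{2} \int_0^t \sum_{i,j=1}^n a^{ij}(y(s)) \alpha_i \alpha_j \, ds.
\end{aligned}$$

Divide by $t$ and let $t \to 0$, recalling the ODE (49):

$$-Dv \cdot (b + A\alpha) \le \frac{1}{2} (A\alpha) \cdot \alpha \quad \text{at } x_0.$$

This is true for all vectors $\alpha \in \mathbb{R}^n$, and consequently

$$\tag{50} \sup_{\alpha \in \mathbb{R}^n} \left\{ -Dv \cdot (b + A\alpha) - \frac{1}{2} (A\alpha) \cdot \alpha \right\} \le 0 \quad \text{at } x_0.$$

But the supremum above is attained for

$$\alpha = -Dv,$$

and so (50) says

$$\frac{1}{2} (ADv) \cdot Dv - b \cdot Dv \le 0 \quad \text{at } x_0.$$

This is (47).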
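The claim that the supremum over $\alpha$ is attained at $\alpha = -Dv$ and equals $\frac{1}{2}(ADv) \cdot Dv - b \cdot Dv$ can be spot-checked numerically. The sketch below is only an illustration: the $2 \times 2$ symmetric positive definite matrix `A`, the vector `b`, and the vector `p` standing in for $Dv(x_0)$ are made-up values, and the helper names are hypothetical.

```python
import random

def matvec(A, x):
    """Matrix-vector product for small dense lists."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def objective(alpha, p, b, A):
    """-p.(b + A alpha) - (1/2)(A alpha).alpha, the expression inside the
    supremum, with p standing in for Dv(x_0)."""
    A_alpha = matvec(A, alpha)
    drift = [bi + ai for bi, ai in zip(b, A_alpha)]
    return -dot(p, drift) - 0.5 * dot(A_alpha, alpha)

def hamiltonian(p, b, A):
    """(1/2)(A p).p - b.p, the claimed value of the supremum."""
    return 0.5 * dot(matvec(A, p), p) - dot(b, p)

# Illustrative data (assumed): A symmetric positive definite, b and p arbitrary.
A = [[2.0, 0.5], [0.5, 1.0]]
b = [0.3, -1.2]
p = [0.7, -0.4]

best = objective([-pi for pi in p], p, b, A)  # value at alpha = -p
rng = random.Random(0)
trials = [objective([rng.uniform(-3, 3), rng.uniform(-3, 3)], p, b, A)
          for _ in range(1000)]

if __name__ == "__main__":
    print(best, hamiltonian(p, b, A), max(trials))
```

Because the expression is a concave quadratic in $\alpha$ when $A$ is symmetric positive definite, no random trial exceeds the value at $\alpha = -p$.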

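7. Next let us suppose

$$\tag{51} w - v \text{ has a local minimum at a point } x_0 \in U,$$

and prove

$$\tag{52} \frac{1}{2} \sum_{i,j=1}^n a^{ij} v_{x_i} v_{x_j} - \sum_{i=1}^n b_i v_{x_i} \ge 0 \quad \text{at } x_0.$$

To verify this inequality, we assume instead that (52) fails, in which case

$$\tag{53} \frac{1}{2} \sum_{i,j=1}^n a^{ij} v_{x_i} v_{x_j} - \sum_{i=1}^n b_i v_{x_i} \le -\theta < 0 \quad \text{near } x_0$$

for some constant $\theta > 0$. Now take a small time $t > 0$. Then the definition (43) implies that
there exists $y(\cdot) \in \mathcal{A}$ such that

$$w(x_0) \ge w(y(t)) + \frac{1}{2} \int_0^t \sum_{i,j=1}^n a_{ij}(y) (\dot y_i - b_i(y))(\dot y_j - b_j(y)) \, ds - \frac{\theta}{2} t.$$

In view of (51), therefore

$$\tag{54}
\begin{aligned}
v(x_0) - v(y(t)) & \ge w(x_0) - w(y(t)) \\
& \ge \frac{1}{2} \int_0^t \sum_{i,j=1}^n a_{ij}(y) (\dot y_i - b_i(y))(\dot y_j - b_j(y)) \, ds - \frac{\theta}{2} t.
\end{aligned}$$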
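Now define

$$\alpha(s) := A^{-1}(y(s)) (\dot y(s) - b(y(s)));$$

so that

$$\tag{55}
\begin{cases}
\dot y(s) = b(y(s)) + A(y(s)) \alpha(s) & (s > 0) \\
y(0) = x_0.
\end{cases}$$

Then

$$\begin{aligned}
v(x_0) - v(y(t)) & = -\int_0^t \frac{d}{ds} v(y(s)) \, ds \\
& = -\int_0^t Dv(y(s)) \cdot \dot y(s) \, ds \\
& = -\int_0^t Dv(y(s)) \cdot [b(y(s)) + A(y(s)) \alpha(s)] \, ds.
\end{aligned}$$

Combine this identity with (54):

$$\begin{aligned}
-\frac{\theta}{2} t & \le \int_0^t (-Dv) \cdot (b(y) + A\alpha(s)) - \frac{1}{2} (A\alpha(s)) \cdot \alpha(s) \, ds \\
& \le \int_0^t \sup_{\alpha} \left\{ (-Dv) \cdot (b(y) + A\alpha) - \frac{1}{2} (A\alpha) \cdot \alpha \right\} ds \\
& = \int_0^t \frac{1}{2} (ADv) \cdot Dv - b \cdot Dv \, ds \\
& \le -\theta t,
\end{aligned}$$

according to (53), provided $t > 0$ is small enough. This is a contradiction however.

We have verified (52).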

8. To summarize, we have so far shown that $w^{\varepsilon_r} \to \tilde{w}$, $\tilde{w}$ solving the nonlinear first order
PDE (42). Likewise $w$ defined by (43) solves the same PDE. In addition $\tilde{w} = w = 0$ on $\Gamma$.
We wish finally to prove that

$$\tag{56} \tilde{w} \equiv w \quad \text{in } U.$$

This is in fact true: the proof in [E-I] utilizes various viscosity solution tricks as well as the
condition (19). We omit the details here.

Finally then, our main assertion (24) follows from (56). □
Appendix A: Units and constants

1. Fundamental quantities     Units

   time                       seconds (s)
   length                     meters (m)
   mass                       kilogram (kg)
   temperature                Kelvin (K)
   quantity                   mole (mol)

2. Derived quantities                                             Units

   force                                                          $\mathrm{kg \cdot m \cdot s^{-2}}$ = newton (N)
   pressure       = force/unit area                               $\mathrm{N \cdot m^{-2}}$ = pascal (Pa)
   work, energy   = force x distance = pressure x volume          $\mathrm{N \cdot m}$ = joule (J)
   power          = rate of work                                  $\mathrm{J \cdot s^{-1}}$ = watt (W)
   entropy                                                        $\mathrm{J \cdot K^{-1}}$
   heat                                                           4.1840 J = calorie

3. Constants

   $R$ = gas constant = 8.314 $\mathrm{J \cdot mol^{-1} \cdot K^{-1}}$

   $k$ = Boltzmann's constant = $1.3806 \times 10^{-23}$ $\mathrm{J \cdot K^{-1}}$

   $N_A$ = Avogadro's number = $R/k$ = $6.02 \times 10^{23}$ $\mathrm{mol^{-1}}$
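The relation $N_A = R/k$ can be checked with a few lines of arithmetic. This is a minimal sketch using only the rounded values quoted in this appendix; the variable names are illustrative.

```python
# Values as quoted in the list of constants (rounded).
R = 8.314        # gas constant, J mol^-1 K^-1
k = 1.3806e-23   # Boltzmann's constant, J K^-1

# Avogadro's number as the ratio R / k, in mol^-1.
N_A = R / k

if __name__ == "__main__":
    print(f"N_A = {N_A:.3e} mol^-1")
```

The ratio agrees with the tabulated $6.02 \times 10^{23}$ to the three significant figures given there.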






Appendix B: Physical axioms 



We record from Callen [C, p. 283-284] these physical "axioms" for a thermal system in
equilibrium.

Postulate I. "There exist particular states (called equilibrium states) that, macroscopically,
are characterized completely by the specification of the internal energy $E$ and a set of
extensive parameters $X_1, \dots, X_m$, later to be specifically enumerated."

Postulate II. "There exists a function (called the entropy) of the extensive parameters,
defined for all equilibrium states, and having the following property. The values assumed by
the extensive parameters in the absence of a constraint are those that maximize the entropy
over the manifold of constrained equilibrium states."

Postulate III. "The entropy of a composite system is additive over the constituent subsystems
(whence the entropy of each constituent system is a homogeneous first-order function
of the extensive parameters). The entropy is continuous and differentiable and is a
monotonically increasing function of the energy."

Postulate IV. "The entropy of any system vanishes in the state for which
$T = (\partial E/\partial S)_{X_1, \dots, X_m} = 0$."

These statements are quoted verbatim, except for minor changes of notation. Postulate
IV is the Third Law of thermodynamics, and is not included in our models.






References 

[A] L. Arnold, Stochastic Differential Equations, Wiley. 

[B-G-L] C. Bardos, F. Golse and D. Levermore, Fluid dynamic limits of kinetic equations 
I, J. Stat. Physics 63 (1991), 323-344. 

[BN] L. Breiman, Probability, Addison-Wesley, 1968. 

[B-S] P. Bamberg and S. Sternberg, A Course in Mathematics for Students of Physics,
Vol 2, Cambridge, 1990.

[B-T] S. Bharatha and C. Truesdell, Classical Thermodynamics as a Theory of Heat 
Engines, Springer, 1977. 

[C] H. Callen, Thermodynamics and an Introduction to Thermostatistics (2nd ed.),
Wiley, 1985.

[CB] B. Chow, On Harnack's inequality and entropy for Gaussian curvature flow, Comm. 
Pure Appl. Math. 44 (1991), 469-483. 

[C-E-L] M. G. Crandall, L. C. Evans and P. L. Lions, Some properties of viscosity solutions 
of Hamilton-Jacobi equations, Trans. AMS 282 (1984), 487-502. 

[C-L] M. G. Crandall and P. L. Lions, Viscosity solutions of Hamilton-Jacobi equations, 
Trans. AMS 277 (1983), 1-42. 

[C-N] B. Coleman and W. Noll, The thermodynamics of elastic materials with heat
conduction and viscosity, Arch. Rat. Mech. Analysis 13 (1963), 167-178.

[C-O-S] B. Coleman, D. Owen and J. Serrin, The second law of thermodynamics for systems 
with approximate cycles, Arch. Rat. Mech. Analysis 77 (1981), 103-142. 

[D] W. Day, Entropy and Partial Differential Equations, Pitman Research Notes in 
Mathematics, Series 295, Longman, 1993. 

[DA] E. E. Daub, Maxwell's demon, in [L-R]. 

[D-S] W. Day and M. Silhavy, Efficiency and existence of entropy in classical
thermodynamics, Arch. Rat. Mech. Analysis 66 (1977), 73-81.

[D-Z] A. Dembo and O. Zeitouni, Large Deviation Techniques and Applications, Jones
and Bartlett Publishers, 1993.

[EL] R. S. Ellis, Entropy, Large Deviations and Statistical Mechanics, Springer, 1985. 






[E1] L. C. Evans, Partial Differential Equations, AMS Press.

[E2] L. C. Evans, The perturbed test function method for viscosity solutions of nonlinear 
PDE, Proc. Royal Soc. Edinburgh 111 (1989), 359-375. 

[E-G] L. C. Evans and R. F. Gariepy, Measure Theory and Fine Properties of Functions, 
CRC Press, 1992. 

[E-I] L. C. Evans and H. Ishii, A PDE approach to some asymptotic problems concerning 
random differential equations with small noise intensities, Ann. Inst. H. Poincare 
2 (1985), 1-20. 

[F] E. Fermi, Thermodynamics, Dover, 1956. 

[F-L1] M. Feinberg and R. Lavine, Thermodynamics based on the Hahn-Banach Theorem:
the Clausius inequality, Arch. Rat. Mech. Analysis 82 (1983), 203-293. 

[F-L2] M. Feinberg and R. Lavine, Foundations of the Clausius-Duhem inequality,
Appendix 2A of [TR].

[FL] W. H. Fleming, Exit probabilities and optimal stochastic control, Appl. Math. 
Optimization 4 (1978), 327-346. 

[FR] M. I. Freidlin, Functional Integration and Partial Differential Equations, Princeton 
University Press, 1985. 

[F-W] M. I. Freidlin and A. D. Wentzell, Random Perturbations of Dynamical Systems, 
Springer, 1984. 

[G] F. R. Gantmacher, The Theory of Matrices, Chelsea, 1960. 

[GU] M. Gurtin, Thermodynamics of Evolving Phase Boundaries in the Plane, Oxford, 
1993. 

[G-W] M. Gurtin and W. Williams, An axiomatic foundation for continuum
thermodynamics, Arch. Rat. Mech. Analysis 26 (1968), 83-117.

[H] R. Hamilton, Remarks on the entropy and Harnack estimates for Gauss curvature 
flow, Comm. in Analysis and Geom. 1 (1994), 155-165. 

[HU] K. Huang, Statistical Mechanics (2nd ed.), Wiley, 1987. 

[J] E. T. Jaynes, E. T. Jaynes: Papers on Probability, Statistics and Statistical Physics
(ed. by R. D. Rosenkrantz), Reidel, 1983.






[vK-L] N. G. van Kampen and J. J. Lodder, Constraints, Amer. J. Physics 52 (1984), 
419-424. 

[L-P-S] P. L. Lions, B. Perthame and P. E. Souganidis, Existence and stability of entropy
solutions for the hyperbolic systems of isentropic gas dynamics in Eulerian and
Lagrangian coordinates, Comm. Pure Appl. Math. 49 (1996), 599-638.

[L-P-T1] P. L. Lions, B. Perthame and E. Tadmor, A kinetic formulation of multidimensional
conservation laws and related equations, J. Amer. Math. Soc. 7 (1994), 169-191. 

[L-P-T2] P. L. Lions, B. Perthame and E. Tadmor, Kinetic formulation of isentropic gas 
dynamics and p-systems, Comm. Math. Physics 163 (1994), 415-431. 

[L-R] H. Leff and A. Rex (ed.), Maxwell's Demon: Entropy, Information, Computing, 
Princeton U. Press, 1990. 

[M-S] C.-S. Man and J. Serrin, Book in preparation. 

[OK] B. Oksendal, Stochastic Differential Equations, Springer, 1989. 

[O] D. Owen, A First Course in the Mathematical Foundations of Thermodynamics,
Springer, 1984.

[P-T] B. Perthame and E. Tadmor, A kinetic equation with kinetic entropy functions for 
scalar conservation laws, Comm. Math. Physics 136 (1991), 501-517. 

[P] M. A. Pinsky, Lectures on Random Evolution, World Scientific, 1992. 

[R] F. Rezakhanlou, Lecture notes from Math 279 (UC Berkeley). 

[S1] J. Serrin, Foundations of Classical Thermodynamics, Lecture Notes, Math.
Department, U. of Chicago, 1975.

[S2] J. Serrin, Conceptual analysis of the classical second laws of thermodynamics, 
Arch. Rat. Mech. Analysis 70 (1979), 353-371. 

[S3] J. Serrin, An outline of thermodynamical structure, in New Perspectives in
Thermodynamics (ed. by Serrin), Springer, 1986.

[SE] M. J. Sewell, Maximum and Minimum Principles, Cambridge, 1987. 

[S] J. Smoller, Shock Waves and Reaction-Diffusion Equations, Springer.

[ST] D. W. Stroock, Probability Theory: An Analytic View, Cambridge University Press,
1993.






[SZ] L. Szilard, "On the decrease of entropy in a thermodynamic system by the
intervention of intelligent beings", in [L-R].

[T] C. J. Thompson, Mathematical Statistical Mechanics, Princeton University Press,
1972.

[TR] C. Truesdell, Rational Thermodynamics (2nd ed.), Springer. 

[T-M] C. Truesdell and R. G. Muncaster, Fundamentals of Maxwell's Kinetic Theory of
a Simple Monatomic Gas, Academic Press, 1980.

[T-N] C. Truesdell and W. Noll, Nonlinear Field Theories of Mechanics, Springer, 1965. 

[V] S. R. S. Varadhan, Large Deviations and Applications, SIAM, 1984. 

[W] A. Wightman, "Convexity and the notion of equilibrium state in thermodynamics 
and statistical mechanics", Introduction to R. B. Israel, Convexity in the Theory 
of Lattice Gases, Princeton U. Press, 1979. 

[Z] M. Zemansky, Heat and Thermodynamics, McGraw-Hill. 






